cleaning up bibtex files.
Mike Marchywka
marchywka at hotmail.com
Sun Sep 22 18:50:34 CEST 2019
My bibtex downloading scripts went through various versions and bugs,
so I wrote some code to go through the resulting files and clean them up.
In the process of accumulating entries, however, it looks like there is a lot
of variation in how they are supplied by publishers.
Since I have gone to this much effort, I am curious whether there is some
preferred format. My biggest concern was making sure everything
I got off the web is marked as such with a url, and I wanted to preserve the
download time so I can retrace what I was doing. I put all of this in "comments"
but am curious whether putting it into the bibtex entry itself would hurt anything.
In particular, the field values seem to be quoted or braced at random, and
I just made them all braced. Does this lose something?
For example, I output stuff like this,
% programmatically fixed probably bu toobib
% loaded from test2.bib written on 2019-09-22:11:25:04
% srcurl: https://www.pnas.org/content/pnas/111/28/10257.full.pdf
% citeurl: http://api.crossref.org/works/10.1073/pnas.1409284111/transform/application/x-bibtex
% med2bib comment: handledoi
% date Wed Feb 6 01:02:49 UTC 2019
@article{Nikoh_2014,
doi = {10.1073/pnas.1409284111},
url = {https://doi.org/10.1073\%2Fpnas.1409284111},
year = {2014},
month = {jun},
publisher = {Proceedings of the National Academy of Sciences},
volume = {111},
number = {28},
pages = {10257--10262},
author = {N. Nikoh and T. Hosokawa and M. Moriyama and K. Oshima and M. Hattori and T. Fukatsu},
title = {Evolutionary origin of insect-Wolbachia nutritional mutualism},
journal = {Proceedings of the National Academy of Sciences}
}
@article{KRAMER201895,
title = {Wolbachia, doxycycline and macrocyclic lactones: New prospects in the treatment of canine heartworm disease},
journal = {Veterinary Parasitology},
volume = {254},
pages = {95 - 97},
year = {2018},
issn = {0304-4017},
doi = {https://doi.org/10.1016/j.vetpar.2018.03.005},
url = {http://www.sciencedirect.com/science/article/pii/S0304401718301055},
author = {L. Kramer and S. Crosara and G. Gnudi and M. Genchi and C. Mangia and A. Viglietti and C. Quintavalla},
keywords = {, , Doxycycline, Macrocyclic lactones}
}
and an excerpt of the diff output shows that almost every input line differs,
often only because of a quote-to-brace change.
(And I went to a lot of effort to preserve the field order instead of sorting alphabetically, lol.)
diff -b test2.bib check2.bib
< title = "Wolbachia, doxycycline and macrocyclic lactones: New prospects in the treatment of canine heartworm disease",
< journal = "Veterinary Parasitology",
< volume = "254",
< pages = "95 - 97",
< year = "2018",
< issn = "0304-4017",
< doi = "https://doi.org/10.1016/j.vetpar.2018.03.005",
< url = "http://www.sciencedirect.com/science/article/pii/S0304401718301055",
< author = "L. Kramer and S. Crosara and G. Gnudi and M. Genchi and C. Mangia and A. Viglietti and C. Quintavalla",
< keywords = ", , Doxycycline, Macrocyclic lactones"
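The quote-to-brace change seen in this diff can be sketched roughly as follows. This is only an illustration in Python, not the code that produced the output above; the names FIELD_RE, is_simple_value and quote_to_brace are made up for the example, and it assumes each field sits on a single line. It rewrites only plain quoted values and leaves bare macros (such as an unquoted jun) and "a" # "b" concatenations alone, since bracing those would turn them into literal strings.

```python
import re

# Hypothetical helper names; assumes one `name = "value"` field per line.
FIELD_RE = re.compile(r'^(\s*[A-Za-z][\w-]*\s*=\s*)"(.*)"(\s*,?\s*)$')


def is_simple_value(value):
    """True if value has balanced braces and no double quote at brace depth 0.

    A top-level quote means the line was really a concatenation such as
    "a" # "b", which must not be collapsed into one braced string.
    """
    depth = 0
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
        elif ch == '"' and depth == 0:
            return False
    return depth == 0


def quote_to_brace(line):
    """Rewrite name = "value" as name = {value}; return other lines unchanged."""
    m = FIELD_RE.match(line)
    if not m:
        return line  # bare macro, number, already braced, or not a field
    prefix, value, suffix = m.groups()
    if not is_simple_value(value):
        return line
    return prefix + "{" + value + "}" + suffix


if __name__ == "__main__":
    for sample in ['journal = "Veterinary Parasitology",', "month = jun,"]:
        print(quote_to_brace(sample))
```

Values that span several lines are not handled here; a real cleaner would need to track braces and quotes across lines.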
I would imagine there are other utilities that clean these up, but I also wanted to
structure the comments in a custom way, although I could just include the
srcurl as a bibtex entry field.
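As a rough sketch of that second option, the "% srcurl:" and "% citeurl:" comment lines could be copied into fields of the entry that follows them. The standard BibTeX styles ignore fields they do not recognize, so extra fields like these should be harmless there, though other tools may treat them differently. The function names below are hypothetical, the field names simply reuse the comment keys, and the code assumes it is handed one complete entry whose last character is the closing brace.

```python
import re

# Only "% key: value" comments with these two keys are picked up (an assumption).
COMMENT_RE = re.compile(r"^%\s*(srcurl|citeurl):\s*(\S+)\s*$")


def comments_to_fields(comment_lines):
    """Collect srcurl/citeurl values from the comment lines preceding an entry."""
    fields = {}
    for line in comment_lines:
        m = COMMENT_RE.match(line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields


def add_fields(entry, fields):
    """Append braced fields just before the closing brace of a single entry."""
    if not fields:
        return entry
    body = entry.rstrip()
    if not body.endswith("}"):
        raise ValueError("entry does not end with a closing brace")
    body = body[:-1].rstrip()  # drop the entry's final "}"
    if not body.endswith(","):
        body += ","  # the previous last field now needs a separator
    extra = ",\n".join(f"{name} = {{{value}}}" for name, value in fields.items())
    return body + "\n" + extra + "\n}\n"


if __name__ == "__main__":
    comments = [
        "% srcurl: https://www.pnas.org/content/pnas/111/28/10257.full.pdf",
        "% date Wed Feb 6 01:02:49 UTC 2019",
    ]
    entry = "@article{Nikoh_2014,\nyear = {2014}\n}\n"
    print(add_fields(entry, comments_to_fields(comments)))
```

The "% date" line is skipped on purpose, because which field should hold the download time is a separate decision.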
I started to write an interactive command-line fixer and thought that if there are
other common problems I could handle all of them now.
Thanks.
--
mike marchywka
306 charles cox
canton GA 30115
USA, Earth
marchywka at hotmail.com
404-788-1216
ORCID: 0000-0001-9237-455X