MX as part of our interactions with the Biodiversity Heritage Library (BHL). We want to collect new terms parsed from Journal of Hymenoptera Research (JHR) articles OCRed on BHL.
But then we ran into the first issue: JHR literature citations are not available in a format we can readily import. We need them per article, whereas BHL provides them in Endnote format per volume. Ideally, in the end, the resulting Endnote parser will be published (at least in part) as my first gem on GitHub (results of project #1).
The justification for creating an Endnote parser primarily has to do with Google. We at the Hymenoptera Anatomy Project are adding lots of references, and adding a reference in MX is unavoidably form-heavy, at least if you type it all in. Google Scholar exports references in Endnote format. Thus the proposed workflow is something like this:
- I have a citation I want to enter
- I Google it
- Copy the Endnote export
- Paste and verify in MX
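To make the "paste" step concrete, here is a minimal sketch of what the parsing side could look like. It assumes the pasted text is in the tagged Endnote style (one `%X value` pair per line, with `%0` starting a record), which is what I expect from a Google Scholar export. The `parse_endnote` method, the `ENDNOTE_TAGS` table, and the returned hash layout are all hypothetical names for illustration, not the eventual gem's API, and the sample citation is made up.

```ruby
# Assumed mapping from tagged-Endnote codes to field names.
# Only a handful of common tags are handled in this sketch.
ENDNOTE_TAGS = {
  '0' => :type,
  'A' => :authors,
  'T' => :title,
  'J' => :journal,
  'V' => :volume,
  'N' => :number,
  'P' => :pages,
  'D' => :year
}.freeze

# Parse tagged Endnote text into an array of hashes, one per record.
def parse_endnote(text)
  records = []
  current = nil
  text.each_line do |line|
    line = line.strip
    if line.empty?
      # A blank line closes the record in progress, if any.
      records << current if current
      current = nil
      next
    end
    match = line.match(/\A%(\S)\s+(.*)\z/)
    next unless match
    key = ENDNOTE_TAGS[match[1]]
    next unless key # silently skip tags this sketch does not know
    if key == :type && current
      # A new %0 line also starts a new record.
      records << current
      current = nil
    end
    current ||= { authors: [] }
    if key == :authors
      current[:authors] << match[2] # authors repeat, one %A per line
    else
      current[key] = match[2]
    end
  end
  records << current if current
  records
end

# Made-up citation, purely for illustration.
sample = <<~ENW
  %0 Journal Article
  %T An example title
  %A Example, A.
  %A Sample, B.
  %J Journal of Examples
  %V 1
  %P 1-10
  %D 2000
ENW

parse_endnote(sample).each { |ref| puts ref.inspect }
```

The hash that comes back is what would be shown in MX for the "verify" step before anything is saved.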
It will be most useful if we can then export all references in Endnote format as well (or perhaps in some other library-friendly formats?). That way we can return the nicely formatted references to those who need them (including JHR and maybe BHL).
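The export direction could be sketched roughly like this, again assuming the tagged Endnote style of one `%X value` line per field. The `to_endnote` method, the `EXPORT_TAGS` table, and the shape of the reference hash are hypothetical, chosen only to illustrate the idea; how MX actually stores references is not shown here, and the sample reference is made up.

```ruby
# Assumed field-to-tag mapping, in the order the lines should be written.
EXPORT_TAGS = [
  [:type,    '0'],
  [:authors, 'A'],
  [:title,   'T'],
  [:journal, 'J'],
  [:volume,  'V'],
  [:number,  'N'],
  [:pages,   'P'],
  [:year,    'D']
].freeze

# Serialize one reference hash as a tagged Endnote record.
def to_endnote(ref)
  lines = []
  EXPORT_TAGS.each do |field, tag|
    # Array() lets a field hold a single value, a list, or nothing at all.
    Array(ref[field]).each do |value|
      lines << "%#{tag} #{value}"
    end
  end
  lines.join("\n") + "\n"
end

# Made-up reference, purely for illustration.
ref = {
  type: 'Journal Article',
  authors: ['Example, A.', 'Sample, B.'],
  title: 'An example title',
  year: '2000'
}

puts to_endnote(ref)
```

Writing several references out is then just a matter of joining the records with a blank line between them.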
What I would really like to see is BHL OCR returned to me based on pages. I know you can already get this by asking BHL to email you the pages, and rumor has it that a wrapper is being written to hack just this, but it would be lovely to access it directly without the hack.