Comment 8 for bug 222340

John S (jcspray) wrote :

What I was referring to is this: the DOI is extracted from the PDF's plain text, and it is not always possible to recover the correct DOI. The expression used is roughly "doi xx.xx/xx", where each x can be any character other than whitespace. This works on a lot of PDFs, but if there are spurious characters, such as a closing parenthesis at the end of the DOI, then there's no simple way to tell whether they're part of the code or not.
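To make the ambiguity concrete, here is a minimal sketch in Python. The pattern and the name find_doi_naive are assumptions for illustration only, not the actual expression in the code; they just follow the rough "doi xx.xx/xx" shape described above.

```python
import re

# Hypothetical pattern (an assumption, not the project's real expression):
# the word "doi", an optional colon and/or whitespace, then a run of
# non-whitespace characters that contains a slash.
NAIVE_DOI_RE = re.compile(r"doi[:\s]*(\S+/\S+)", re.IGNORECASE)


def find_doi_naive(text):
    """Return the first DOI-like token in text, or None if there is none."""
    match = NAIVE_DOI_RE.search(text)
    return match.group(1) if match else None


# A clean occurrence is extracted as expected.
clean = find_doi_naive("see doi:10.1000/xyz123 for details")

# A parenthesised occurrence keeps the closing ")" because ")" is a
# non-whitespace character like any other.
wrapped = find_doi_naive("(doi:10.1000/xyz123)")
```

Because a closing parenthesis is just another non-whitespace character, the pattern by itself cannot tell whether it belongs to the DOI.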

Anyway, I took another look at the code and added a special case for (doi:xx/xx) to remove the trailing parenthesis. Time will tell whether that breaks it for anything else.
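For anyone following along, the special case could look something like the sketch below. This is not the code that was committed; the patterns and the name find_doi are hypothetical, and it assumes the opening parenthesis sits directly before "doi".

```python
import re

# Hypothetical patterns (assumptions for illustration, not the real code).
# The first only matches a DOI wrapped as "(doi:xx/xx)"; the second is the
# general "doi xx.xx/xx" form.
PAREN_DOI_RE = re.compile(r"\(doi[:\s]*(\S+/\S+)\)", re.IGNORECASE)
PLAIN_DOI_RE = re.compile(r"doi[:\s]*(\S+/\S+)", re.IGNORECASE)


def find_doi(text):
    """Return the first DOI-like token, dropping the ")" of "(doi:...)"."""
    # Try the parenthesised form first so the enclosing ")" is not captured;
    # fall back to the general pattern otherwise.
    match = PAREN_DOI_RE.search(text) or PLAIN_DOI_RE.search(text)
    return match.group(1) if match else None
```

With this sketch, "(doi:10.1000/xyz123)" yields "10.1000/xyz123", while a bare "doi:10.1000/xyz123" is unaffected. A parenthesis that closes a longer aside, as in "(see doi:10.1000/xyz123)", would still slip through, which is the kind of case that remains hard to tell apart from a DOI that really ends in ")".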