parse amazon data
Bug #152793 reported by Aaron Swartz
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Open Library | Confirmed | Medium | Edward Betts |
Bug Description
There are around 6M Amazon books now up at:
http://
They should be parsed and eventually integrated. (Also, there are another million or so since the last time you grabbed the ISBNs from here.)
Changed in openlibrary:
assignee: nobody → edward-debian
importance: Undecided → High
milestone: none → launch
status: New → Confirmed
Changed in openlibrary:
importance: High → Medium
The catalog.txt file contains duplicates, for example:
0002165163 1 Amazon.com: Spinner's yarn: Books: Ian Alexander Ross Peebles
0002165163 o-0 Amazon.com: Spinner's yarn: Books: Ian Alexander Ross Peebles
0002165171 1 Amazon.com: Memoirs: Books: Jean Monnet
0002165171 o-0 Amazon.com: Memoirs: Books: Jean Monnet
000216518X 1 Amazon.com: Media Mob: Books: George Melly
000216518X o-0 Amazon.com: Media Mob: Books: George Melly
000216521X 1 Amazon.com: Old Glory an American Voyage: Books: Johnathan Raban
000216521X o-0 Amazon.com: Old Glory an American Voyage: Books: Johnathan Raban
0002165252 1 404 - Document Not Found
0002165252 o-0 404 - Document Not Found
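One way to collapse these duplicates is to key each record on its ISBN and keep only the first occurrence. The sketch below is a rough illustration, not the parser used for Open Library. It assumes each catalog.txt line is whitespace-separated into an ISBN, a marker field (`1` or `o-0` in the sample above, meaning unknown), and a page title. The names `CatalogEntry`, `parse_line`, and `dedupe` are hypothetical, and dropping the "404 - Document Not Found" rows is an optional assumption, off by default.

```python
from typing import Iterable, Iterator, NamedTuple, Optional

# Title text seen in the sample for ISBNs that Amazon did not resolve.
NOT_FOUND_TITLE = "404 - Document Not Found"


class CatalogEntry(NamedTuple):
    isbn: str
    marker: str  # second column; its meaning is not stated in the bug report
    title: str


def parse_line(line: str) -> Optional[CatalogEntry]:
    """Split one line into (isbn, marker, title); return None if malformed."""
    parts = line.strip().split(None, 2)
    if len(parts) < 3:
        return None
    return CatalogEntry(parts[0], parts[1], parts[2])


def dedupe(lines: Iterable[str], drop_not_found: bool = False) -> Iterator[CatalogEntry]:
    """Yield the first entry seen for each ISBN, in input order."""
    seen = set()
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry.isbn in seen:
            continue
        seen.add(entry.isbn)
        if drop_not_found and entry.title == NOT_FOUND_TITLE:
            continue
        yield entry


if __name__ == "__main__":
    sample = [
        "0002165171 1 Amazon.com: Memoirs: Books: Jean Monnet",
        "0002165171 o-0 Amazon.com: Memoirs: Books: Jean Monnet",
        "0002165252 1 404 - Document Not Found",
        "0002165252 o-0 404 - Document Not Found",
    ]
    for entry in dedupe(sample):
        print(entry.isbn, entry.title)
```

Keeping the first occurrence is an arbitrary choice; if the two marker values turn out to carry different information, the rows would need to be merged rather than discarded.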