Biocurious is a weblog about biology, quantified.

PDF organisation

by PhilipJ on 5 September 2007

More ask-the-audience questions: How do you manage your electronic archive of papers? The dead-tree versions I leave in binders, but I’m having a harder time deciding how to organise my PDF library. Given that I’m starting in a new field for my Ph.D., it is the perfect opportunity to get things right from the start. So how do you do it?

So far I’ve been organising them as I would my paper copies: into various sub folders in a Papers directory (Experiments, Theory, Instrumentation…) that give a general idea of what the papers are about. When it comes to actually retrieving an article I want to read again, however, I have so far been using the built in Spotlight tool in Mac OS X. If you remember at least a few of the words in the paper, and an author name, chances are you can find what you’re looking for.

This breaks down, however, with old papers. These are normally scanned in from paper copies, and it isn’t possible to search for words within the documents any longer. I don’t have that many old papers, but it does make it more difficult than I’d like to find what I’m looking for, particularly when I have a bunch of old papers all on similar topics by the same people.

I’ve given Papers a try, which in principle seems like a nice way to manage everything, but in practice I found it frustrating to use (and it’s also not free, which seems like a questionable policy given that grad students will be the main adopters of something like this!). I also thought a bit about web applications like Connotea, but if I’m ever without internet connectivity, there goes all ability to search through my papers. It also doesn’t seem to keep a copy of the PDF of the paper for me. Furthermore, it requires that I tag the papers with metadata myself, whereas Spotlight does this automatically for me.

Chances are there will be no “perfect” solution, but I’m eager to hear how others keep track of their PDFs. And those who have the hundreds-of-PDFs-on-my-desktop strategy, no thanks!



  1. result.hariadi    3549 days ago    #

    I have been using BibDesk ( http://bibdesk.sourceforge.net/ ) for about a year to organize my papers. It is free and really good.

    BibDesk has so many good features that it would take a lot of space to explain them here. You should try it out.


  2. Enro    3549 days ago    #

    Spotlight rocks, I agree! But for managing your references, I would suggest BibDesk, as above. In addition, if you use CiteULike, you may want to read this interesting page: http://phnk.com/blog/tech/citeulike-and-bibdesk/


  3. PhilipJ    3549 days ago    #

    BibDesk, of course! I think I banished it from my mind the minute I handed in the final copy of my MSc thesis (even though the icon is still happily sitting in my Dock…).

    Manual tagging unfortunately, but you can put in urls to PDFs on your drive, and the searching tool works quite well. This seems like the best of all worlds so far.


  4. Sour Grapes    3549 days ago    #

    Why not mail each paper to a GMail account, as an attachment to a short mail message which contains as many keywords as you think you might need to be able to find it again under any circumstances. File it using more general tags. When you need to find it, search on one or more of the keywords, which since they form part of the message and not part of the document, will be open to a GMail search.

    Added benefit: they’re all stored online so you can access them even when you’re not at your own computer. If you ever want to add keywords, just reply to your own original message and put them in there.


  5. sam    3549 days ago    #

    How about some PDF organizing programs for PC users (my lab uses MS Windows or Linux exclusively)?

    I like CiteULike for organizing articles: I think it’s better than Connotea right now, and you can upload PDFs. But I agree that this depends on an internet connection and relies on my good tagging to begin with.

    A way around the first problem: Zotero saves stuff right on your desktop.


  6. arun    3549 days ago    #

    I use JabRef. It is free, open source and is written in Java. I have found it immensely useful.


  7. MadGenius    3548 days ago    #

    KBibTex for those on Linux (http://www.unix-ag.uni-kl.de/~fischer/kbibtex/index.html)


  8. PIerre    3548 days ago    #

    I think that you can save your PDFs with http://www.citeulike.org, which is an equivalent of Connotea.

    see also:
    http://www.connotea.org/uri/c54ca32365e3045e87467a1e8c35ca76
    http://www.mauropiccini.it/projects/shoka/index.php/about/
    http://www.scribd.com/
    http://mekentosj.com/papers/
    http://ipapers.sourceforge.net/iPapers.html

    Pierre


  9. Fred Ross    3548 days ago    #

    Don’t bother with Connotea. It’s a commercial ripoff of CiteULike (www.citeulike.org), which works very well and has automatic posting from a huge range of journals and services. It also has some very interesting collaboration tools.

    Not having access to your papers when you’re offline sounds a lot more intimidating than it really is. I have only wanted a reference once or twice in the past few years and not been able to get to it (and I don’t have Internet access at home).


  10. Andre    3548 days ago    #

    I’m also doing the put-in-folders-search-with-spotlight thing. So far I’ve been happy enough with it that I haven’t been pushed to change to something better. I should really check out some of these alternatives though… someday…


  11. Ricardo Vidal    3548 days ago    #

    There is another Mac application that is made specifically for PDF organizing and it’s called “Yep”.
    It looks pretty decent but I can’t give you a personal review since I’m a PC user…boo!!

    You can find the app here:
    http://www.yepthat.com/

    Note: I’m in no way affiliated to this site or app.


  12. Terri Yu    3547 days ago    #

    I also use JabRef and tag each article with keywords that are searchable. The nice thing about JabRef is that it is just a GUI for editing the BibTeX file. You can always open up the BibTeX file with another program or even just look at the text. You could even search the text for keywords using grep or some other Unix command. So no worries about compatibility in the future.
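To illustrate the point about the BibTeX file being plain text, here is a minimal sketch of a grep-like keyword search written in Python. It assumes each entry starts on a line beginning with `@`, which is how BibTeX files are usually laid out but is not guaranteed. The `SAMPLE` entries and the function names are invented for illustration; they are not taken from JabRef or from anyone's real library.

```python
import re

# Invented sample data, standing in for the contents of a real .bib file.
SAMPLE = """
@article{doe2005,
  author = {Doe, Jane},
  title = {Example paper on optical trapping},
  keywords = {instrumentation, optical tweezers},
  year = {2005}
}

@article{doe1987,
  author = {Doe, John},
  title = {Example paper on bioluminescence},
  keywords = {bioluminescence, experiments},
  year = {1987}
}
"""


def split_entries(bibtex_text):
    """Group lines into entries; a new entry starts at a line beginning with '@'."""
    entries, current = [], []
    for line in bibtex_text.splitlines():
        if line.lstrip().startswith("@"):
            if current:
                entries.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries


def citation_key(entry):
    """Return the key from '@type{key,' or None if the entry looks malformed."""
    match = re.match(r"\s*@\w+\s*\{\s*([^,\s]+)", entry)
    return match.group(1) if match else None


def search(bibtex_text, *keywords):
    """Return citation keys of entries containing every keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for entry in split_entries(bibtex_text):
        lowered = entry.lower()
        if all(k in lowered for k in wanted):
            hits.append(citation_key(entry))
    return hits


if __name__ == "__main__":
    print(search(SAMPLE, "bioluminescence"))
```

For a real library you would read the text from the `.bib` file on disk instead of `SAMPLE`. The search matches anywhere in the entry (author, title, keywords), which is roughly what grep would do, just grouped per entry rather than per line.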


  13. Stew    3543 days ago    #

    Upfront disclaimer: I work for NPG, who develop Connotea.

    Fred:
    “Don’t bother with Connotea. It’s a commercial ripoff of CiteULike (www.citeulike.org)”

    Hey! That’s a bit unfair. CiteULike is every bit as commercial as Connotea. If anything it’s more so: Connotea is completely open source and its data is accessible through an API, unlike CuL.

    Also… Connotea is not a ripoff of CiteULike. They started around the same time and are both ripoffs of del.icio.us. ;)

    Anyway…. more pertinently:

    What about Zotero? Can’t remember if that stores PDFs or not but you can certainly access your references offline even though you’re collecting them while browsing.


  14. Eli    3539 days ago    #

    If you haven’t already committed to one of these others, you may really want to try out Zotero. It was originally developed for work in history and the humanities but I find it to be a more flexible and powerful option than any of the others mentioned so far. It stores unlimited notes, PDFs, or other files for each bibliographic entry. You can also tag entries or organize by folders.

    http://www.zotero.org/

    Quick intro screencast.
    http://www.zotero.org/videos/tour/zotero_tour.htm


  15. PhilipJ    3539 days ago    #

    Unfortunately Zotero seems to be Firefox-specific, and I don’t use Firefox. I’d also be worried about needing to have my web browser open while trying to write my thesis. Keeping the browser closed is the only way I get any work done. :)

    Thanks for the suggestions everyone, and if there are others I don’t know about, keep ‘em coming!


  16. fishcake    3539 days ago    #

    i prefer pdfs to be local so i can read papers anywhere i have my laptop. i organize them in general topic folders.

    what bothers me is that full text pdfs of articles always have useless file names (so those have to be changed manually). they also never have any useful metadata (pdfs have title, author, etc, fields but journals rarely use them). i manually paste this information in, because windows explorer will show columns with that metadata.

    it sucks.
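The renaming chore described above can at least be made consistent. Below is a small sketch, assuming you already know the first author's surname, the year, and the title (typed in by hand or copied from the article page). It does not read anything out of the PDF itself, since that would need a third-party library, and the `surname_year_title.pdf` naming scheme and the function names are just one hypothetical convention.

```python
import re
import unicodedata


def slug(text, max_words=6):
    """Lowercase, strip accents and punctuation, keep the first few words."""
    # NFKD splits accented letters into base letter + accent; dropping
    # non-ASCII bytes then removes the accent and keeps the base letter.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])


def paper_filename(first_author_surname, year, title):
    """Build a predictable file name such as 'doe_2005_example-title.pdf'."""
    return f"{slug(first_author_surname)}_{year}_{slug(title)}.pdf"


if __name__ == "__main__":
    # Invented example values, not a real paper.
    print(paper_filename("Müller", 2005, "An Example Title: With Punctuation & Extra Words"))
```

Names built this way sort by author and year in any file browser and stay searchable even for scanned papers that have no text layer.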


  17. Anders    3528 days ago    #

    Hi all,

    are there any of these online archive services that interface well with EndNote?

    I wouldn’t like to first tag and insert all papers into some online service and then go through the same trouble when I want to cite that paper using EndNote.


  18. PhilipJ    3528 days ago    #

    Any EndNote junkies that can give Anders a hand? I don’t know anything about it myself.


  19. JoshL    3525 days ago    #

    As near as I can tell, there is no acceptable PDF organizer available for the PC. I make do with the does-everything utility called Omea Pro (freeware). Its range of facilities is daunting, but if you use it for nothing but the file indexing and preview capability, it is entirely worth it. (It has a nice RSS and web-notating feature and, if you want, will run your life…) I use it for an overview of my local PDFs, on which I can perform text searches. Double-clicking launches my default viewer, PDF-Xchange viewer (freeware, and in every way superior to devil’s-spawn Adobe Reader). Xchange has a “typewriter” feature which allows you to create annotations on a PDF which are saved with the file (!!!), so if you use this for internal tags (say “bioluminescence”), that term will come up when searching text in Omea (or anything else that will search inside PDFs: Google desktop search, Copernic, X1, etc.).

    Alternatively, something like XYplorer (a freeware file manager and Explorer replacement) will let you preview and search inside PDFs; use it with the Xchange Viewer trick above to organize articles.

    Still waiting for “Papers” to be ported to PC, but authors are hardcore Mac people so highly unlikely.

    Hope this is helpful.

