SystematicReviewer (open-source) | OpenAI

spark flame Jan 9, 2023, 6:54 AM

To help automate systematic reviews of the scientific literature on a topic, we've created https://github.com/scottleibrand/SystematicReviewer, which uses GPT-3 to systematically answer your list of questions against a list of paper URLs.

To do so, download_articles_and_embeddings.py first takes as input a papers.csv with a list of paper URLs, downloads the full-text HTML or PDF of each paper, splits it into sections, generates embeddings for each section, and stores the results.
Then answer_questions.py takes as input a questions.csv of questions to answer (or a single question string), calculates the embedding for each question, finds the top_n sections most relevant to that question, and feeds those sections (plus the title and abstract, if provided in papers.csv) to InstructGPT to answer the question. It then combines the top_n answers into a single answer for each question. After all questions are answered, it writes a copy of the original papers.csv with additional answer columns.
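
For anyone wondering what "finds the top_n sections most relevant to the question" might look like, here is a minimal sketch. It is not the code from the repository: the function names are made up for illustration, the toy three-dimensional vectors stand in for real embeddings, and ranking by cosine similarity is an assumption (a common choice for comparing embeddings).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_sections(question_embedding, section_embeddings, top_n=3):
    """Return the names of the top_n sections ranked by similarity to the question.

    section_embeddings is a dict mapping a section name to its embedding vector
    (a hypothetical layout, chosen only for this sketch).
    """
    scored = [
        (cosine_similarity(question_embedding, vector), name)
        for name, vector in section_embeddings.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_n]]


# Toy vectors standing in for real embeddings.
sections = {
    "methods": [0.9, 0.1, 0.0],
    "results": [0.7, 0.7, 0.0],
    "acknowledgements": [0.0, 0.0, 1.0],
}
question = [1.0, 0.0, 0.0]
print(top_n_sections(question, sections, top_n=2))
```

In a real run the selected sections would then be passed to the model along with the question, and the per-section answers combined.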

pseudo grove Jan 9, 2023, 10:21 AM

Thanks! This is great.

spark flame Jan 10, 2023, 5:21 AM

@quasi saffron if you want to try your hand at adding new features (or more likely instructing ChatGPT to do so), there are a couple of TODOs in the code for reusing questions that have already been asked and answered (for example if you add more paper URLs or questions, it shouldn’t need to re-run all the ones it has already done).
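
One possible way to approach that TODO, as a rough sketch only: read the previous output CSV and work out which (paper, question) pairs still have no answer, so only those get sent to the model. The column name `url`, the idea that each answer column is headed by the question text, and the function name `pending_pairs` are all assumptions for illustration and may not match how the script actually lays out its output.

```python
import csv
import io


def pending_pairs(papers_csv_text, questions, url_column="url"):
    """Return the (url, question) pairs that do not have an answer yet.

    Assumes (hypothetically) that each answer is stored in a column whose
    header is the question text itself.
    """
    reader = csv.DictReader(io.StringIO(papers_csv_text))
    pending = []
    for row in reader:
        for question in questions:
            # A missing column or a blank cell both count as "not yet answered".
            if not (row.get(question) or "").strip():
                pending.append((row[url_column], question))
    return pending


# Made-up output from an earlier run: one question, answered for one paper only.
previous_output = (
    "url,What was the sample size?\n"
    "https://example.org/paper-a,120 participants\n"
    "https://example.org/paper-b,\n"
)
questions = ["What was the sample size?", "What was the primary outcome?"]

for url, question in pending_pairs(previous_output, questions):
    print(url, "|", question)
```

Here the answered pair (paper-a, sample size) is skipped, and the new question is queued for both papers.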

quasi saffron Jan 10, 2023, 5:36 AM

Thank you very much!

cloud loom Jan 11, 2023, 11:56 PM

👍 This looks great. I should try this.

hushed perch Jan 12, 2023, 1:25 AM

Beautiful implementation! For the part about reusing questions, maybe one could fine-tune a model on the valid question-answer pairs from a run, so it doesn’t go back and rerun everything it has already done. Let me know what you think. 🙂

spark flame Jan 12, 2023, 1:36 AM

hushed perch Beautiful implementation! For the part of reusing questions, maybe one can fine ...

Fine-tuned models require a lot of cycles to train, and cost 6x as much to run for inference. So I don’t think that will make sense from a cost perspective.

hollow dock Feb 9, 2023, 2:06 AM

Hi, can you make a YouTube tutorial video? It would be nice for people like me who are not familiar with this, and there are a lot of us.

hollow dock Feb 12, 2023, 11:13 AM

Can someone provide a more detailed guide on how to do it?
