Wondering if a long article/paper is worth reading in full or in part? We all know that GPT-3 text-davinci-003 is good at summarization. But I still spend way too much time reading/skimming articles and papers, so I decided to make a tool to help with that. This weekend ChatGPT, Copilot, and I created an open-source summarizer tool to provide both an Overall Summary and more detailed Section Summaries of any paper/article/website: https://github.com/scottleibrand/gpt-summarizer
The script extracts text from a given file or URL and splits it into sections. It writes the full extracted text and each section to separate text files. It then generates and writes out a summary for each subsection, plus a combined section summary where needed.
It splits the document into subsections of 1000-3000 tokens based on HTML section headings or numbered section headings. It uses text-davinci-003 to generate and write section summaries, and then summarizes the lower-level summaries to produce an overall summary.
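To make the split-then-summarize flow concrete, here is a minimal sketch of the idea, not the script's actual code. It assumes plain-text input with numbered headings, estimates tokens at roughly 4 characters each instead of using a real tokenizer, and only enforces the upper end of the size range. All function names here are hypothetical, and the summarizer is passed in as a plain function so that a call to text-davinci-003 (or anything else) can be plugged in.

```python
import re
from typing import Callable, List, Tuple

# Assumed heading style: "1 Introduction", "2.3 Methods", "4. Results".
HEADING_RE = re.compile(r"^\d+(?:\.\d+)*\.?[ \t]+\S.*$", re.MULTILINE)


def estimate_tokens(text: str) -> int:
    """Rough estimate (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


def split_on_headings(text: str) -> List[str]:
    """Split text at each numbered heading, keeping the heading with its body."""
    starts = [m.start() for m in HEADING_RE.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first heading
    bounds = starts + [len(text)]
    parts = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [p for p in parts if p]


def split_by_budget(section: str, max_tokens: int) -> List[str]:
    """Group paragraphs into chunks that stay under max_tokens.

    A single paragraph larger than max_tokens is left as one oversized chunk.
    """
    chunks: List[str] = []
    current = ""
    for para in section.split("\n\n"):
        candidate = current + "\n\n" + para if current else para
        if current and estimate_tokens(candidate) > max_tokens:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_document(
    text: str,
    summarize: Callable[[str], str],
    max_tokens: int = 3000,
) -> Tuple[str, List[str]]:
    """Summarize each chunk, then summarize the summaries for an overall view."""
    chunks: List[str] = []
    for section in split_on_headings(text):
        chunks.extend(split_by_budget(section, max_tokens))
    section_summaries = [summarize(chunk) for chunk in chunks]
    overall = summarize("\n\n".join(section_summaries))
    return overall, section_summaries
```

In this sketch, `summarize` would wrap whatever model call you use. Keeping it as a parameter also makes the splitting logic easy to try out with a stand-in function before spending any API credits.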
It's currently implemented as a Python script you can run from the command line. Remember to set your OpenAI API key as an environment variable before running it.
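As a small illustration of the environment-variable step, the sketch below reads the key and fails early with a readable message if it is missing. The variable name OPENAI_API_KEY is an assumption here (it is the name the openai Python library looks for by default), and `get_api_key` is a hypothetical helper, not a function from the script.

```python
import os
from typing import Mapping, Optional


def get_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise a clear error.

    Assumes the key is stored under OPENAI_API_KEY.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it in your shell before running."
        )
    return key
```

On macOS or Linux you would typically set the variable in your shell first (for example with `export OPENAI_API_KEY=...`, using your own key) and then run the script from that same shell session.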
The intended use is that both the overall summary and the subsection summaries are worth reading (in order) to determine whether to spend the time reading the entire article/paper, or specific sections of it.
It might also form the basis for a summarization service, for example for anyone doing literature reviews or reading lots of papers and articles for research. With enough users, it'd be worthwhile to cache generated summaries and make them accessible to everyone.
It might even be the basis for an intelligent reader app, which would allow you to smoothly zoom in from an article headline/subhead through AI-generated overall and section summaries all the way down to the article text itself.
Have any other ideas for potential applications? Want to try it yourself? Download a copy, give it a try, and share your thoughts and experiences!