David Adams b883f1e242 add docs and release files		2026-03-14 13:27:41 +11:00
docs	add docs and release files	2026-03-14 13:27:41 +11:00
.gitignore	add docs and release files	2026-03-14 13:27:41 +11:00
.python-version	initial working version	2024-12-02 17:34:31 +11:00
CONTRIBUTING.md	add docs and release files	2026-03-14 13:27:41 +11:00
LICENSE	add docs and release files	2026-03-14 13:27:41 +11:00
main.py	upgrading to actually download papers	2025-02-01 22:02:44 +11:00
pyproject.toml	add docs and release files	2026-03-14 13:27:41 +11:00
README.md	add docs and release files	2026-03-14 13:27:41 +11:00
requirements.txt	add docs and release files	2026-03-14 13:27:41 +11:00
summarizer.py	fulltext pdf	2026-03-12 19:55:24 +11:00
uv.lock	add docs and release files	2026-03-14 13:27:41 +11:00

Career Summarizer

Small Python project that builds academic career reports from publication metadata and paper text.

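The workflow, as reflected in the project structure below: read researcher names from names.txt, collect their publications, download and process the available PDFs, build a per-researcher vector store, and generate a report.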

Requirements

Quick Start

Using uv:

uv sync
uv run python main.py --max_papers 20

Using pip:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python main.py --max_papers 20
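
For readers curious how the --max_papers flag used above might be handled, here is a minimal sketch with argparse. Only the flag name comes from the commands above; the default value, the help text, and the function names build_parser and parse_args are assumptions for illustration and may not match main.py.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the --max_papers flag shown in Quick Start."""
    parser = argparse.ArgumentParser(
        description="Build academic career reports."
    )
    parser.add_argument(
        "--max_papers",
        type=int,
        default=20,  # assumed default, not taken from main.py
        help="Maximum number of papers to process per researcher.",
    )
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Processing up to {args.max_papers} papers per researcher")
```

Passing a smaller value, for example --max_papers 5, is a quick way to try the pipeline before a full run.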

Output

Generated reports are written to reports/. Downloaded papers and per-researcher vector stores are kept in papers/.

Notes

Some papers do not expose direct PDF links and will be skipped.
The first run can be slow because of embedding and model setup.
This project depends on third-party data sources and model output; review generated reports before use.
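
The skipping behaviour for papers without a direct PDF link can be pictured roughly as follows. This is a sketch under assumptions: the Paper record, its pdf_url field, and the split_downloadable helper are hypothetical and are not taken from summarizer.py.

```python
from typing import List, Optional, Tuple, TypedDict


class Paper(TypedDict):
    """Hypothetical publication record; field names are assumptions."""
    title: str
    pdf_url: Optional[str]


def split_downloadable(papers: List[Paper]) -> Tuple[List[Paper], List[Paper]]:
    """Separate papers with a direct PDF link from those to be skipped."""
    downloadable = [p for p in papers if p.get("pdf_url")]
    skipped = [p for p in papers if not p.get("pdf_url")]
    return downloadable, skipped


if __name__ == "__main__":
    sample: List[Paper] = [
        {"title": "Paper A", "pdf_url": "https://example.org/a.pdf"},
        {"title": "Paper B", "pdf_url": None},
    ]
    ok, skipped = split_downloadable(sample)
    for paper in skipped:
        print(f"Skipping (no direct PDF link): {paper['title']}")
```

Keeping the skipped list around, rather than silently dropping entries, makes it easier to see how much of a researcher's record a report actually covers.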

Project Structure

main.py: CLI entry point.
summarizer.py: publication collection, PDF processing, vector store creation, and report generation.
names.txt: input researcher names.
reports/: generated reports.
papers/: downloaded papers and per-researcher vector stores.
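
To make the layout above concrete, the sketch below reads names from names.txt and derives per-researcher locations under reports/ and papers/. The one-name-per-line file format, the slug scheme, the .md report extension, and the helper names are all assumptions for illustration; the project may organise these paths differently.

```python
from pathlib import Path
from typing import Dict, List


def read_names(path: Path) -> List[str]:
    """Read researcher names, assuming one per line; blank lines are ignored."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def slugify(name: str) -> str:
    """Turn a name into a simple directory-friendly slug (assumed scheme)."""
    return "_".join(name.lower().split())


def output_paths(name: str, root: Path = Path(".")) -> Dict[str, Path]:
    """Derive hypothetical report and paper locations for one researcher."""
    slug = slugify(name)
    return {
        "report": root / "reports" / f"{slug}.md",  # assumed file extension
        "papers": root / "papers" / slug,
    }


if __name__ == "__main__":
    names_file = Path("names.txt")
    if names_file.exists():
        for researcher in read_names(names_file):
            print(researcher, output_paths(researcher))
```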

License

This project is licensed under the MIT License. See LICENSE.