
This repository contains the official implementation for the AAAI25 paper "From Words to Worth: Newborn Article Impact Prediction with LLM".


NAIP Logo

Framework for newborn article impact & quality estimation.


NAIP Framework Overview

The NAIP series uses fine-tuned LLMs to quickly predict the **impact** or **quality** of articles based on their internal content.
| Version | Input | Output | Model Weights | Homepage | Paper |
|---------|-------|--------|---------------|----------|-------|
| v1 | Title & Abstract | Impact Estimation (0–1) | Link | Link | AAAI 2025 |
| v2 | Title & Abstract | Quality Estimation | Link | Link | arXiv |

🚀 Update Log

  • 250930 – Introducing NAIPv2: extending the series with an emphasis on quality estimation.
  • 241210 - The paper has been accepted by AAAI 2025!
  • 241204 - Hugging Face Spaces support 🥰
    • We've set up an online demo on Hugging Face Spaces, so you can try it out without writing a single line of code!
  • 241126 - V1.0 We're thrilled to announce the end of Early Access and the official release of V1.0! ✨
    • The codebase is now more organized and easier to navigate! 🧹
    • Updated and streamlined README with detailed setup and usage instructions. 💡
    • Decoupled the dataset, added more LoRA adapter weight download links, and more! 🔄
    • Known issues: building the NAID dataset has not been tested on other machines and may cause problems. We plan to replace this function with a more powerful framework in another codebase.
  • 240808 - Early Access
    • We have released the Early Access version of our code!

Quick Deployment (for most researchers)

First, clone the repo and run the following commands in the console:

git clone https://github.com/ssocean/NAIP.git
cd NAIP
pip install -r requirements.txt
  • To try v1, please use demo_v1.py.
  • To try v2, please use demo_v2.py.
  • You may need to download the corresponding model weights.
  • When providing the title and abstract, please avoid line breaks, LaTeX symbols, or other special formatting.
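
Since the models expect plain, single-line text, a small pre-processing step can strip the formatting mentioned above. The helper below is our own sketch, not part of the repo; the regexes cover only the most common cases (inline math and simple LaTeX commands), so adapt them to your data:

```python
import re

def clean_text(text: str) -> str:
    """Best-effort cleanup of a title/abstract before feeding it to the model.

    Assumption: removing inline math, unwrapping simple LaTeX commands, and
    collapsing whitespace is enough; this helper does not ship with NAIP.
    """
    text = re.sub(r"\$[^$]*\$", " ", text)                  # drop inline math like $O(n)$
    text = re.sub(r"\\[a-zA-Z]+\{([^}]*)\}", r"\1", text)   # \emph{word} -> word
    text = re.sub(r"\s+", " ", text)                        # line breaks -> single spaces
    return text.strip()

print(clean_text("A Study of\n\\emph{Impact} with $O(n)$ cost"))
# -> "A Study of Impact with cost"
```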

Reproducing NAIPv1 (optional)

The following instructions are outdated. We are undergoing a major code refactoring. An updated version will be released after 2025.10.7.

Fine-tuning

For fine-tuning, you may manually modify 'xxxForSequenceClassification' in the transformers package (see llama_for_naip/NAIP_LLaMA.py for details), or follow the Hugging Face instructions for loading models with custom code instead.
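
Conceptually, the modification attaches a regression-style head to the LLM. The dependency-free sketch below illustrates the idea under the assumption that v1 maps a pooled hidden representation through a linear layer and a sigmoid to a score in (0, 1); the function name and numbers are ours, and the actual head lives in llama_for_naip/NAIP_LLaMA.py:

```python
import math

def impact_head(hidden, weights, bias=0.0):
    """Toy regression head: linear layer (hidden_size -> 1) plus sigmoid.

    Assumption: this mirrors the shape of the classification-head change,
    not the exact implementation in NAIP_LLaMA.py.
    """
    logit = sum(h * w for h, w in zip(hidden, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid keeps the score in (0, 1)

score = impact_head([0.2, -0.5, 1.0], [0.3, 0.1, 0.4])
print(f"impact score: {score:.3f}")
```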

Then, prepare a train.sh bash script like the one below:

DATA_PATH="ScImpactPredict/NAID/NAID_train_extrainfo.csv"
TEST_DATA_PATH="ScImpactPredict/NAID/NAID_test_extrainfo.csv"

OMP_NUM_THREADS=1 accelerate launch offcial_train.py \
    --total_epochs 5 \
    --learning_rate 1e-4 \
    --data_path $DATA_PATH \
    --test_data_path $TEST_DATA_PATH \
    --runs_dir official_runs/LLAMA3 \
    --checkpoint  path_to_huggingface_LLaMA3

Finally, run sh train.sh in the console and wait for training to finish.

Testing

Similar to fine-tuning, prepare test.sh as below:

python official_test.py \
 --data_path NAIP/NAID/NAID_test_extrainfo.csv \
 --weight_dir path_to_runs_dir

Then, run sh test.sh.

Reproducing NAIPv2 (optional)

Preliminary code and the dataset are released at ./v2_resource; detailed instructions will be released after 2025.10.7. 🚀 (Core team members are on vacation 🏖️)

🛠️ Technical Support

If you would like to conduct comparison experiments with NAIP but encounter difficulties in setting up the environment or reproducing the code, we provide free technical support.

Simply send us a .csv file containing the "title" and "abstract" fields, and we will return the prediction results to you.
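
Such a file can be produced with Python's standard csv module. The file name and rows below are illustrative; only the "title" and "abstract" columns matter:

```python
import csv

# Hypothetical papers; replace with your own data.
rows = [
    {"title": "From Words to Worth", "abstract": "We predict article impact with LLMs."},
    {"title": "Another Paper", "abstract": "A second example abstract."},
]

with open("papers.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "abstract"])
    writer.writeheader()   # header row: title,abstract
    writer.writerows(rows)
```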

  • In urgent cases, results can be provided within one day.
  • This service is free of charge and intended to facilitate fair, reproducible comparisons in research.

📩 Please contact us via [[email protected]].

📚 Citation

If you find this work useful, please cite:

@article{Zhao2024NAIP,
  title={From Words to Worth: Newborn Article Impact Prediction with LLM},
  author={Penghai Zhao and Qinghua Xing and Kairan Dou and Jinyu Tian and Ying Tai and Jian Yang and Ming-Ming Cheng and Xiang Li},
  journal={ArXiv},
  year={2024},
  volume={abs/2408.03934},
  url={https://api.semanticscholar.org/CorpusID:271744831}
}
