Version | Input | Output | Model Weights | Homepage | Paper |
---|---|---|---|---|---|
v1 | Title & Abstract | Impact Estimation (0–1) | Link | Link | AAAI 2025 |
v2 | Title & Abstract | Quality Estimation | Link | Link | arXiv |
- 250930 - Introducing NAIPv2: extending the series with an emphasis on quality estimation.
- 241210 - The paper has now been accepted by AAAI 2025!
- 241204 - Hugging Face Spaces Support 🥰
  - We've set up an online demo on Hugging Face Spaces, so you can now try it without writing a single line of code!
- 241126 - V1.0: We’re thrilled to announce the end of Early Access and the official release of V1.0! ✨
  - The codebase is now more organized and easier to navigate! 🧹
  - Updated and streamlined README with detailed instructions for setup and usage. 💡
  - Decoupled dataset, more download links for LoRA adapter weights, and more! 🔄
  - Known Issues: The functionality for building the NAID dataset has not been tested on other machines, which may lead to issues. We plan to replace this function with a more powerful framework in another of our codebases.
- 240808 - Early Access
  - We have released the Early Access version of our code!
First, clone the repo and run the following commands in the console:
git clone https://github.com/ssocean/NAIP.git
cd NAIP
pip install -r requirements.txt
- To try v1, please use `demo_v1.py`.
- To try v2, please use `demo_v2.py`.
- You may need to download the corresponding model weights.
- When providing the title and abstract, please avoid line breaks, LaTeX symbols, or other special formatting.
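The formatting advice in the last bullet can be applied automatically before calling either demo. The sketch below is one heuristic way to flatten a title or abstract into a single plain-text line. The function name `clean_field` and the set of LaTeX patterns it strips are assumptions made for illustration; they are not part of the NAIP codebase, and the README does not specify exactly which symbols cause problems.

```python
import re


def clean_field(text: str) -> str:
    """Heuristically flatten a title or abstract to one plain-text line.

    Hypothetical helper, not part of NAIP: the stripped patterns are a guess
    at what "LaTeX symbols or other special formatting" covers.
    """
    # Keep escaped literal characters: "\%" -> "%", "\&" -> "&".
    text = re.sub(r"\\([%&_#])", r"\1", text)
    # Drop math delimiters but keep what is between them.
    text = text.replace("$", "")
    # Unwrap one-argument commands: "\textbf{word}" -> "word".
    text = re.sub(r"\\[a-zA-Z]+\*?\{([^{}]*)\}", r"\1", text)
    # Remove remaining commands such as "\alpha"; the symbol itself is lost.
    text = re.sub(r"\\[a-zA-Z]+", " ", text)
    # Remove stray backslashes and braces.
    text = re.sub(r"[\\{}]", " ", text)
    # Collapse line breaks, tabs, and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()
```

Because bare commands such as `\alpha` are dropped rather than translated, it is worth reading the cleaned text once before submitting it.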
The following instructions are outdated. We are undergoing a major code refactoring. An updated version will be released after 2025.10.7.
For fine-tuning, you may manually modify the 'xxxForSequenceClassification' class in the transformers package (see llama_for_naip/NAIP_LLaMA.py for more details), or follow the instructions to use custom code.
Then, prepare a `train.sh` bash file like the one below:
DATA_PATH="ScImpactPredict/NAID/NAID_train_extrainfo.csv"
TEST_DATA_PATH="ScImpactPredict/NAID/NAID_test_extrainfo.csv"
OMP_NUM_THREADS=1 accelerate launch offcial_train.py \
--total_epochs 5 \
--learning_rate 1e-4 \
--data_path $DATA_PATH \
--test_data_path $TEST_DATA_PATH \
--runs_dir official_runs/LLAMA3 \
--checkpoint path_to_huggingface_LLaMA3
Finally, type `sh train.sh` in the console and wait for the training to finish.
Similar to fine-tuning, prepare a `test.sh` file as below:
python official_test.py \
--data_path NAIP/NAID/NAID_test_extrainfo.csv \
--weight_dir path_to_runs_dir
Then, type `sh test.sh`.
Preliminary code and dataset are released at ./v2_resource; detailed instructions will be released after 2025.10.7. 🚀 (Core team members are on vacation 🏖️)
If you would like to conduct comparison experiments with NAIP but encounter difficulties in setting up the environment or reproducing the code, we provide free technical support.
Simply send us a `.csv` file containing the "title" and "abstract" fields, and we will return the prediction results to you.
- In urgent cases, results can be provided within one day.
- This service is free of charge and intended to facilitate fair, reproducible comparisons in research.
📩 Please contact us via [[email protected]].
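For the comparison service described above, the `.csv` file could be produced with Python's standard `csv` module. This is a minimal sketch: only the two column names, "title" and "abstract", come from the text above. The function name `papers_to_csv`, the choice to quote every field, and the flattening of line breaks are assumptions made for illustration.

```python
import csv
import io

# Column names taken from the description above.
FIELDS = ["title", "abstract"]


def papers_to_csv(papers) -> str:
    """Serialize (title, abstract) pairs to CSV text with a header row.

    Hypothetical helper, not part of NAIP.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        quoting=csv.QUOTE_ALL,  # quote every field so commas in abstracts are safe
        lineterminator="\n",
    )
    writer.writeheader()
    for title, abstract in papers:
        # Flatten line breaks so each paper stays on one physical line.
        writer.writerow({
            "title": " ".join(title.split()),
            "abstract": " ".join(abstract.split()),
        })
    return buffer.getvalue()
```

The returned string can then be written to a file (for example `papers.csv`, a placeholder name) opened with `encoding="utf-8"` and `newline=""`.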
If you find this work useful, please cite:
@article{Zhao2024NAIP,
title={From Words to Worth: Newborn Article Impact Prediction with LLM},
author={Penghai Zhao and Qinghua Xing and Kairan Dou and Jinyu Tian and Ying Tai and Jian Yang and Ming-Ming Cheng and Xiang Li},
journal={ArXiv},
year={2024},
volume={abs/2408.03934},
url={https://api.semanticscholar.org/CorpusID:271744831}
}