Task for the ATD course - DWS MSc Spring 2022
- Create a crawler to get articles and save them in csv files
- Add them to Postgres
- Connect Postgres with Python using a connector (psycopg3)
- Read credentials from config file
- Add create directory if not exists in `extract_body.py`
- Fix `article_path.csv`
- Add threshold to relevant docs in `text_query.py`
- Add columns to show in `text_query.py`
- Show lines that have keywords (grep maybe?)
- Add `requirements.txt`
- Fix: in `text_query.py:301`, check if list is empty
- Move `links.csv` to `csv_files`
- Add show vector in `text_query.py` output
- Use GIN index on docvec column
- Displaying docvec is troublesome in the terminal
- Add comments
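
A few of the items above (reading credentials from a config file, creating a directory if it does not exist, and the GIN index on docvec) could look roughly like the sketch below. This is not the project's actual code: the INI file layout, the `postgresql` section name, the `articles` table name, the index name, and all function names are assumptions made for illustration, and docvec is assumed to be a `tsvector` column.

```python
import configparser
from pathlib import Path


def load_db_credentials(config_path, section="postgresql"):
    """Read connection settings from an INI file and return them as a dict.

    The section name "postgresql" is an assumption, not the project's layout.
    """
    parser = configparser.ConfigParser()
    # parser.read() returns the list of files it could read; empty means missing.
    if not parser.read(config_path):
        raise FileNotFoundError(f"Config file not found: {config_path}")
    if not parser.has_section(section):
        raise KeyError(f"Section [{section}] missing in {config_path}")
    return dict(parser[section])


def ensure_directory(path):
    """Create the directory (and any parents) if it does not exist yet."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Hypothetical table and index names; docvec is assumed to be a tsvector column.
CREATE_GIN_INDEX = (
    "CREATE INDEX IF NOT EXISTS articles_docvec_idx "
    "ON articles USING GIN (docvec);"
)


def create_docvec_index(credentials):
    """Connect with psycopg 3 and create the GIN index on docvec."""
    import psycopg  # imported lazily so the helpers above work without it

    # Leaving the connection context commits the transaction and closes it.
    with psycopg.connect(**credentials) as conn:
        conn.execute(CREATE_GIN_INDEX)
```

With a config file whose `[postgresql]` section has keys such as `host`, `dbname`, `user` and `password`, `create_docvec_index(load_db_credentials("database.ini"))` would pass them straight to `psycopg.connect` as keyword arguments. The file name `database.ini` is also just an example.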
