UID:
kobvindex_HPB1452609884
Format:
1 online resource
ISBN:
9783031657948, 3031657942
Series Statement:
Lecture notes in artificial intelligence 14770
Content:
This Open Access book constitutes the refereed proceedings of the First International Workshop on Natural Scientific Language Processing and Research Knowledge Graphs, NSLP 2024, held in Hersonissos, Crete, Greece, on May 27, 2024. The 10 full papers and 11 short papers included in this volume were carefully reviewed and selected from a total of 26 submissions. The proceedings aim to bring together researchers working on the processing, analysis, transformation, and use of scientific language and research knowledge graphs, including all relevant subtopics.
Note:
Intro -- Preface -- Organization -- Contents -- Scholarly Information Processing -- Scholarly Question Answering Using Large Language Models in the NFDI4DataScience Gateway -- 1 Introduction -- 2 Related Works -- 3 Methodological Framework -- 3.1 The Gateway -- Federated Search -- 3.2 Scholarly Question Answering -- 4 Evaluation -- 4.1 Evaluation Dataset -- 4.2 Evaluation Metrics -- 4.3 Results -- 5 Limitations and Future Directions -- 6 Conclusion -- References -- Cite-worthiness Detection on Social Media: A Preliminary Study -- 1 Introduction -- 2 Related Work -- 3 Data -- 4 Experiments -- 4.1 Setting -- 4.2 Results -- 4.3 Discussion -- 5 Limitations -- 6 Conclusion -- References -- Towards a Novel Classification of Table Types in Scholarly Publications -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Data -- 3.2 Taxonomies Construction -- 3.3 Annotation -- 3.4 Models -- 3.5 Evaluation Metrics -- 4 Results -- 4.1 Dataset Analysis -- 4.2 Table Type Classification -- 5 Discussion -- 6 Limitations -- 7 Conclusion -- A Examples of Matrix, Horizontal Listing, and Vertical Listing Tables -- B Illustrations of Table Features -- References -- OCR Cleaning of Scientific Texts with LLMs -- 1 Introduction -- 2 Motivation -- 3 Related Work -- 4 The Training Data -- 4.1 The "JIM Corpus" -- 4.2 The Datamaker Pipeline -- 4.3 Synthetic Data -- 4.4 The Gold Standard Corpus -- 5 The Training Method -- 6 Results and Discussion -- 6.1 Metric Definitions -- 6.2 The SOTA -- 6.3 Performance Improvement -- 7 Conclusion -- References -- Identifying and Leveraging Research Software -- RTaC: A Generalized Framework for Tooling -- 1 Introduction -- 2 Related Works -- 2.1 Dataset and Tooling Benchmarks -- 2.2 Tooling LLMs -- 2.3 Prompting Methods -- 3 Method -- 3.1 RTaC (Reimagining Tooling as Coding) -- 3.2 Evaluation Metrics -- 4 Experiments -- 4.1 Retrievers -- 4.2 Closed Source LLMs -- 4.3 Open Source LLMs -- 5 Results -- 6 Conclusion -- 7 Future Scope -- A Appendix -- A.1 JSON Converter -- A.2 Prompt for Section 4.3 -- A.3 Default Tools Used to Generate New Tools -- References -- Scientific Software Citation Intent Classification Using Large Language Models -- 1 Introduction -- 2 Related Work -- 2.1 Research Software Studies -- 2.2 Citation Intent Classification -- 3 Methods -- 3.1 Citation Intent Classes -- 3.2 Data -- 3.3 Training Models -- 4 Results -- 4.1 Results of BERT Models -- 4.2 Results of GPT-3.5/GPT-4 -- 5 Data and Code Availability Statement -- 6 Discussion -- 7 Conclusion -- References -- RepoFromPaper: An Approach to Extract Software Code Implementations from Scientific Publications -- 1 Introduction -- 2 Related Work -- 3 RepoFromPaper: Methodology -- 3.1 PDF-to-Text Conversion -- 3.2 Sentence Extraction -- 3.3 Sentence Classification -- 3.4 Sentence Ranking -- 3.5 Repository Link Search -- 4 Evaluation Methods -- 4.1 Mean Reciprocal Rank (MRR) -- 4.2 Precision, Recall, and F1 Score
Language:
English
Keywords:
Electronic books.
DOI:
10.1007/978-3-031-65794-8