UID:
almahu_9949880883502882
Extent:
1 online resource (313 pages)
Edition:
1st ed.
ISBN:
9783031657948
Series:
Lecture Notes in Computer Science Series ; v.14770
Note:
Intro -- Preface -- Organization -- Contents -- Scholarly Information Processing -- Scholarly Question Answering Using Large Language Models in the NFDI4DataScience Gateway -- 1 Introduction -- 2 Related Works -- 3 Methodological Framework -- 3.1 The Gateway - Federated Search -- 3.2 Scholarly Question Answering -- 4 Evaluation -- 4.1 Evaluation Dataset -- 4.2 Evaluation Metrics -- 4.3 Results -- 5 Limitations and Future Directions -- 6 Conclusion -- References -- Cite-worthiness Detection on Social Media: A Preliminary Study -- 1 Introduction -- 2 Related Work -- 3 Data -- 4 Experiments -- 4.1 Setting -- 4.2 Results -- 4.3 Discussion -- 5 Limitations -- 6 Conclusion -- References -- Towards a Novel Classification of Table Types in Scholarly Publications -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Data -- 3.2 Taxonomies Construction -- 3.3 Annotation -- 3.4 Models -- 3.5 Evaluation Metrics -- 4 Results -- 4.1 Dataset Analysis -- 4.2 Table Type Classification -- 5 Discussion -- 6 Limitations -- 7 Conclusion -- A Examples of Matrix, Horizontal Listing, and Vertical Listing Tables -- B Illustrations of Table Features -- References -- OCR Cleaning of Scientific Texts with LLMs -- 1 Introduction -- 2 Motivation -- 3 Related Work -- 4 The Training Data -- 4.1 The "JIM Corpus" -- 4.2 The Datamaker Pipeline -- 4.3 Synthetic Data -- 4.4 The Gold Standard Corpus -- 5 The Training Method -- 6 Results and Discussion -- 6.1 Metric Definitions -- 6.2 The SOTA -- 6.3 Performance Improvement -- 7 Conclusion -- References -- Identifying and Leveraging Research Software -- RTaC: A Generalized Framework for Tooling -- 1 Introduction -- 2 Related Works -- 2.1 Dataset and Tooling Benchmarks -- 2.2 Tooling LLMs -- 2.3 Prompting Methods -- 3 Method -- 3.1 RTaC (Reimagining Tooling as Coding) -- 3.2 Evaluation Metrics -- 4 Experiments.
4.1 Retrievers -- 4.2 Closed Source LLMs -- 4.3 Open Source LLMs -- 5 Results -- 6 Conclusion -- 7 Future Scope -- A Appendix -- A.1 JSON Converter -- A.2 Prompt for Section 4.3 -- A.3 Default Tools Used to Generate New Tools -- References -- Scientific Software Citation Intent Classification Using Large Language Models -- 1 Introduction -- 2 Related Work -- 2.1 Research Software Studies -- 2.2 Citation Intent Classification -- 3 Methods -- 3.1 Citation Intent Classes -- 3.2 Data -- 3.3 Training Models -- 4 Results -- 4.1 Results of BERT Models -- 4.2 Results of GPT-3.5/GPT-4 -- 5 Data and Code Availability Statement -- 6 Discussion -- 7 Conclusion -- References -- RepoFromPaper: An Approach to Extract Software Code Implementations from Scientific Publications -- 1 Introduction -- 2 Related Work -- 3 RepoFromPaper: Methodology -- 3.1 PDF-to-Text Conversion -- 3.2 Sentence Extraction -- 3.3 Sentence Classification -- 3.4 Sentence Ranking -- 3.5 Repository Link Search -- 4 Evaluation Methods -- 4.1 Mean Reciprocal Rank (MRR) -- 4.2 Precision, Recall, and F1 Score -- 4.3 Training and Testing Corpora -- 5 Results -- 6 A Corpus of Papers and Their Corresponding Implementations -- 7 Discussion -- 8 Conclusions and Future Work -- References -- Automated Extraction of Research Software Installation Instructions from README Files: An Initial Analysis -- 1 Introduction -- 2 Related Work -- 3 PlanStep: Extracting Installation Instructions from README Files -- 3.1 Classical Planning: Software Installation Instructions -- 3.2 PlanStep Methodology -- 3.3 PlanStep Corpus Creation -- 3.4 Ground Truth Extraction for PlanStep -- 3.5 Distribution of the Installation Instructions of README Files -- 3.6 PlanStep Prompting -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Evaluation Metrics -- 4.3 Evaluation Results -- 4.4 Analysis -- 5 Discussion.
6 Conclusion and Future Work -- References -- A Technical/Scientific Document Management Platform -- 1 Introduction -- 2 Related Work -- 3 A Brief Overview of DANKE -- 4 DANKE-U -- 5 Use Case -- 6 Conclusions and Directions for Future Work -- A Appendix -- References -- Research Knowledge Graphs -- The Effect of Knowledge Graph Schema on Classifying Future Research Suggestions -- 1 Introduction -- 1.1 Problem Statement -- 1.2 Research Questions -- 2 Related Work -- 3 Methodology -- 3.1 Schema and Dataset -- 3.2 Graph Classification of Challenges and Directions -- 3.3 Experimental Setup -- 4 Results -- 4.1 Graph Classification -- 5 Discussion -- 5.1 Limitations and Future Research -- 6 Conclusion -- A Tables and Figures -- References -- Assessing the Overlap of Science Knowledge Graphs: A Quantitative Analysis -- 1 Introduction -- 2 A Methodology for Assessing SKG Overlap -- 2.1 Data Preparation -- 2.2 Category Alignment -- 3 Initial SKG Overlap Assessment: OpenAIRE and OpenAlex -- 3.1 Paper-Category Schema Representation -- 3.2 SKG Overlap Analysis: Data Preparation -- 3.3 SKG Overlap Analysis: Category Alignment -- 4 Results -- 5 Related Work -- 5.1 KG Alignment Based on Embeddings -- 5.2 KG Alignment Based on Machine Learning -- 6 Conclusions and Future Work -- References -- Shared Task: FoRC -- FoRC@NSLP2024: Overview and Insights from the Field of Research Classification Shared Task -- 1 Introduction -- 2 Related Work -- 3 Tasks Description -- 4 Shared Task Datasets -- 4.1 Subtask I -- 4.2 Subtask II -- 5 Results -- 5.1 Baselines -- 5.2 Subtask I -- 5.3 Subtask II -- 6 Discussion -- 7 Conclusion -- References -- NRK at FoRC 2024 Subtask I: Exploiting BERT-Based Models for Multi-class Classification of Scholarly Papers -- 1 Introduction -- 2 Related Work -- 3 Approach -- 4 Experimental Setup -- 4.1 Data -- 4.2 Configuration Settings.
5 Results and Discussion -- 5.1 Baseline Performance -- 5.2 Results -- 5.3 Error Analysis -- 6 Conclusion and Future Work -- References -- Advancing Automatic Subject Indexing: Combining Weak Supervision with Extreme Multi-label Classification -- 1 Introduction -- 2 Related Work -- 3 Data Analysis -- 4 Experiments -- 5 Results and Discussion -- 6 Conclusion and Future Work -- References -- Single-Label Multi-modal Field of Research Classification -- 1 Introduction -- 2 Related Work -- 3 The SLAMFORC System -- 3.1 Multi-modal Data -- 3.2 Classifier -- 4 Experiments -- 5 Conclusions -- References -- Enriched BERT Embeddings for Scholarly Publication Classification -- 1 Introduction and Background -- 2 Dataset -- 3 Enrichment -- 4 Approaches -- 4.1 BERT-Embeddings -- 4.2 Combined BERT-Embeddings (TwinBERT) -- 5 Results -- 6 Discussion and Conclusion -- References -- Shared Task: SOMD -- SOMD@NSLP2024: Overview and Insights from the Software Mention Detection Shared Task -- 1 Introduction -- 2 Related Work -- 3 Tasks Description -- 4 Dataset -- 5 Results -- 5.1 Subtask I -- 5.2 Subtask II -- 5.3 Subtask III -- 6 Conclusion -- References -- Software Mention Recognition with a Three-Stage Framework Based on BERTology Models at SOMD 2024 -- 1 Introduction -- 2 Related Work -- 3 Approach -- 3.1 Approach 1: Token Classification with BERTs -- 3.2 Approach 2: Two-Stage Framework for Entity Extraction and Classification -- 3.3 Approach 3: Three-Stage Framework -- 4 Experimental Setup -- 4.1 Data and Evaluation Metrics -- 4.2 System Settings -- 5 Main Results -- 6 Conclusion and Future Work -- References -- ABCD Team at SOMD 2024: Software Mention Detection in Scholarly Publications with Large Language Models -- 1 Introduction -- 2 Related Work -- 3 Approach -- 3.1 Overview -- 3.2 Low-Rank Adaptation -- 4 Experimental Setup -- 4.1 Dataset and Evaluation Metrics.
4.2 System Settings -- 5 Main Results -- 6 Discussion -- 6.1 Challenges in Applying LLMs to NER Tasks -- 6.2 Conclusion and Future Work -- References -- Falcon 7b for Software Mention Detection in Scholarly Documents -- 1 Introduction -- 2 Related Work -- 2.1 Rule-Based and Classical Machine Learning Approaches -- 2.2 Deep Learning-Based Approaches -- 2.3 Large Language Model-Based Approaches -- 3 Method -- 4 Experimental Results -- 4.1 Results -- 5 Conclusion -- References -- Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models -- 1 Introduction -- 2 Related Work -- 3 SOMD Shared Task -- 4 Using LLMs for Software Related IE-Tasks -- 4.1 Challenges in Applying LLMs to NER Tasks -- 4.2 Sample Retrieval for RAG on Various IE-Tasks -- 4.3 Extraction of Software Entities -- 4.4 Extraction of Software Attributes -- 4.5 Relation Extraction as Single-Choice Question Answering Task -- 5 Experiments -- 5.1 Models -- 5.2 Prompting -- 5.3 Train Sample Retrieval for Few-Shot Generation -- 5.4 Relation Extraction Baseline -- 6 Results -- 7 Conclusion -- A Dataset Overview -- B Similarity Search Examples -- B.1 Search by Entity Similarity -- B.2 Search by Sentence Similarity -- C Prompting Examples -- References -- Author Index.
Other editions:
Print version: Rehm, Georg. Natural Scientific Language Processing and Research Knowledge Graphs. Cham : Springer, c2024. ISBN 9783031657931
Language:
English
Subject(s):
Electronic books.