####################
Datasets for CoreNLP
####################

This section details the various datasets available for CoreNLP tasks in ASEAN languages.
We have grouped them by task and we also provide links to the relevant repositories where available.

**********************
Part-of-speech Tagging
**********************

.. csv-table::
   :file: tables/pos-datasets.csv
   :header-rows: 1

************************
Named Entity Recognition
************************

.. csv-table::
   :file: tables/ner-datasets.csv
   :header-rows: 1

********************
Constituency Parsing
********************

.. csv-table::
   :file: tables/cp-datasets.csv
   :header-rows: 1

.. note::
   There are no Thai constituency treebanks (that we are aware of).
   As the Thai language is more amenable to analysis via dependency grammar, only dependency treebanks are available at the moment.
   Shallow parsing/chunking is available in many of the open-source Thai datasets if that is of interest (e.g. LST20, ThaiNER).

******************
Dependency Parsing
******************

.. csv-table::
   :file: tables/dp-datasets.csv
   :header-rows: 1

**********************
Coreference Resolution
**********************

.. csv-table::
   :file: tables/coref-datasets.csv
   :header-rows: 1

.. note::
   D = Document // P = Paragraph // S = Sentence