Lecture Natural language processing: Chapter 4 – Lê Ngọc Tấn

Shared by: Diên Vu | File type: PDF | Pages: 15


Lecture “Natural language processing – Chapter 4: Computational linguistics” covers: what computational linguistics is, corpus definitions, corpus categories, applications of parallel corpora, alignment methods, normalization, lemmatization, and tokenization.



Trường Đại học Công nghiệp Tp. HCM (Industrial University of Ho Chi Minh City)
Faculty of Information Technology

N.L.P.
NATURAL LANGUAGE PROCESSING
Teacher: Lê Ngọc Tấn
Email: letan.dhcn@gmail.com
Blog: http://lengoctan.wordpress.com

Chapter 4
Computational Linguistics

What is computational linguistics?

It is an interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective.
– Corpus, corpora
– Pre-processing: normalization, tokenization, …
– Alignment methods
– Programming

Corpus Definitions

What is a corpus?
– A corpus is a large collection of texts
– Corpora: the plural of corpus, i.e. a set of such collections

Golden corpus
– Brown Corpus
– Susanne Corpus
– EUROPARL Corpus

A corpus can be annotated, for example POS-tagged (part-of-speech tagged).

Corpus Categories (1)

[Figure: schema of corpus evolution]
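The slides name normalization, tokenization, and POS-tagged corpora without showing what they look like in practice. The following is a minimal sketch using only the Python standard library; it is not taken from the lecture. The function names `normalize` and `tokenize`, the two sample sentences, and the choice of Penn Treebank-style tags are all assumptions made for illustration, and the tags are written by hand rather than produced by a tagger.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


# A toy corpus: a small collection of raw texts (hypothetical sentences).
raw_corpus = [
    "The  cat sat on the mat.",
    "Dogs bark!",
]

# Pre-processing: normalize each text, then tokenize it.
tokenized_corpus = [tokenize(normalize(text)) for text in raw_corpus]

# A hand-annotated (POS-tagged) version of the first sentence.
# Tags follow the Penn Treebank convention; this is an illustrative
# assumption, not the tag set used in the lecture.
tagged_sentence = [
    ("the", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
    ("the", "DT"), ("mat", "NN"), (".", "."),
]

for tokens in tokenized_corpus:
    print(tokens)
```

Storing an annotated sentence as a list of `(token, tag)` pairs keeps the annotation aligned with the token sequence, which is one common way to represent a POS-tagged corpus in memory.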