Abstract
In the previous chapters we discussed the general learning problem, and saw some machine learning models and algorithms for training them. All of these models take vectors x as input and produce predictions. Up until now we assumed the vectors x are given. In language processing, the vectors x are derived from textual data in order to reflect various linguistic properties of the text. The mapping from textual data to real-valued vectors is called feature extraction or feature representation, and is performed by a feature function. Deciding on the right features is an integral part of a successful machine learning project. While deep neural networks alleviate much of the need for feature engineering, a good set of core features still needs to be defined. This is especially true for language data, which comes in the form of a sequence of discrete symbols. This sequence must somehow be converted to a numerical vector, and the conversion is not obvious.
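The chapter itself develops concrete feature functions; as a minimal hypothetical sketch of the idea (not the author's own construction), a feature function mapping a sequence of discrete symbols to a real-valued vector might be a simple bag-of-words counter over a fixed vocabulary:

```python
from collections import Counter

def feature_function(text, vocabulary):
    """Map a text (a sequence of discrete symbols) to a real-valued vector.

    This illustrative feature function counts, for each word in a fixed
    vocabulary, how often it occurs in the text (bag of words). Words
    outside the vocabulary are ignored.
    """
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]

# A toy vocabulary; each position in the output vector corresponds to one word.
vocab = ["the", "dog", "cat", "sat"]
vec = feature_function("The dog sat near the cat", vocab)
# vec is a 4-dimensional vector of word counts: [2.0, 1.0, 1.0, 1.0]
```

Even this toy example shows why the conversion is non-obvious: the vector's dimensionality, the choice of vocabulary, and the decision to discard word order are all design decisions of the feature function.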
Copyright information
© 2017 Springer Nature Switzerland AG
Cite this chapter
Goldberg, Y. (2017). Features for Textual Data. In: Neural Network Methods for Natural Language Processing. Synthesis Lectures on Human Language Technologies. Springer, Cham. https://doi.org/10.1007/978-3-031-02165-7_6
Print ISBN: 978-3-031-01037-8
Online ISBN: 978-3-031-02165-7