In Data Engineering, we can classify data based on how structured it is.


Structured Data

Structured data is data that follows a fixed format and can be organized into rows and columns. Any Excel spreadsheet or SQL table, for example, is structured.
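
As a rough illustration of the fixed row-and-column format, the sketch below builds a small SQL table with Python's built-in sqlite3 module. The table name, column names, and values are hypothetical and chosen only for the example.

```python
import sqlite3

# Hypothetical table and values, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", "Lisbon"), (2, "Bruno", "Porto")],
)

# Every row has the same columns, in the same order.
cursor = conn.execute("SELECT id, name, city FROM customers ORDER BY id")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()

print(columns)
for row in rows:
    print(row)
```

Because the schema is declared up front, every row is guaranteed to have the same three fields, which is what makes the data tabular.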


Semi-structured Data

Semi-structured data has consistent characteristics and a recognizable format, and it carries metadata (such as tags or keys) that separates its fields, but it does not follow a tabular structure. JSON and XML files, MongoDB documents, and emails, for example, are semi-structured.
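
A minimal sketch of what this looks like in practice, using Python's built-in json module. The records are invented for the example: each one names its own fields, so the two records can share some keys and differ on others, and values can be nested.

```python
import json

# Hypothetical records, invented for illustration.
raw = """
[
  {"id": 1, "name": "Ana", "tags": ["vip"]},
  {"id": 2, "name": "Bruno", "address": {"city": "Porto"}}
]
"""

records = json.loads(raw)

# Field names travel with each record, so fields can differ between records.
field_sets = [sorted(record.keys()) for record in records]
shared_fields = sorted(set(field_sets[0]) & set(field_sets[1]))

print(field_sets)
print(shared_fields)
```

Only "id" and "name" appear in both records, so the data cannot be placed into a single fixed set of columns without first deciding how to handle the missing and nested fields.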


Unstructured Data

Any data that does not conform to either of the two types above may be classified as unstructured. This includes audio, images, natural-language text, and geographic data.


References