The data was collected by the Food and Agriculture Organization of the United Nations (FAO) and covers the years 1961 to 2021 for 245 countries and territories. The area code refers to the United Nations M49 standard. One could argue that the term Big Data is not justified in our case, because the datasets are small. That is true for the data as it stands, but with a longer observation period and more frequent data collection the use of Big Data methods can easily be justified.
Climate Change
The climate change indicator used in this project is the temperature (in degrees Celsius) over time. The information relevant to our study is found in the columns Area Code (M49), Months Code, Elements Code and Year. The data was normalized before use. This does not discard any information we need, since we are only interested in temperature changes.
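The text does not state which normalization was applied, so the following is only a minimal sketch that assumes a z-score normalization (subtract the mean, divide by the standard deviation). The function name `z_normalize` and the sample values are hypothetical. It illustrates why the step keeps the information about changes: the ordering of the values and the relative size of the differences between them are preserved.

```python
from statistics import mean, pstdev

def z_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation.

    Assumption: z-score normalization; the project may have used another scheme.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series carries no change information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical temperature values for four consecutive years.
temperatures = [0.1, 0.3, 0.2, 0.6]
normalized = z_normalize(temperatures)
```

Because the transformation is linear, every difference between two years is scaled by the same constant factor, so trends in the normalized series mirror those in the raw series.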
Crop Production
The data provided by the FAO includes 278 common crop and livestock products, both processed and unprocessed (examples: Cucumber, Coriander, Oil, Whole fresh goat...). Before the data was used in the application, it was cleaned. We added the option to pass two lists specifying the crops and countries to be considered in the analysis. A query drops all entries that do not belong to the selected countries or crops before further processing. Additionally, since many countries did not exist for the whole period over which the data was collected, we filter out any countries that do not have entries for the entire time span of the dataset. Furthermore, many columns in the dataset are of no interest to the project, so all columns except those for the country, the kind of crop, and the yields and production volumes of each year are dropped as well.
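The cleaning steps described above can be sketched as follows. This is an illustration under assumptions, not the project's actual query: the rows are assumed to be in long format (one dictionary per country, crop, element and year), the column names `Area`, `Item`, `Element`, `Year` and `Value` are assumed, and the function name `clean_crop_rows` is hypothetical. A country is treated as complete if, after the crop filter, it has at least one non-missing value for every year in the span.

```python
from collections import defaultdict

# Assumed column names; every other column is dropped.
KEEP_COLUMNS = ("Area", "Item", "Element", "Year", "Value")

def clean_crop_rows(rows, countries, crops, first_year=1961, last_year=2021):
    """Filter long-format rows by country, crop, and completeness over time."""
    countries, crops = set(countries), set(crops)

    # Step 1: drop entries outside the requested countries and crops.
    selected = [r for r in rows if r["Area"] in countries and r["Item"] in crops]

    # Step 2: collect the years for which each country has a non-missing value.
    years_seen = defaultdict(set)
    for r in selected:
        if r["Value"] is not None:
            years_seen[r["Area"]].add(r["Year"])

    # Step 3: keep only countries that cover the whole time span.
    required = set(range(first_year, last_year + 1))
    complete = {area for area, years in years_seen.items() if required <= years}

    # Step 4: drop every column that is not needed downstream.
    return [
        {col: r[col] for col in KEEP_COLUMNS}
        for r in selected
        if r["Area"] in complete
    ]
```

Filtering by country and crop first keeps the completeness check cheap, since it only has to look at rows that are candidates for the analysis anyway.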