Semantical approach of assuring high data quality by applying data mining techniques
Sasquatch
Sasquatch is a holistic approach for assuring high data quality in data management systems by applying a conceptual view on the data and using machine learning algorithms for deciding on the correctness of the data.
Sasquatch ("semantical approach of assuring high data quality by applying data mining techniques") is a holistic approach of assuring high data quality in enterprises' data management systems. It utilizes two main ideas for its goal: On the one hand, the given data schemas of one or more data management systems are mapped to the concepts and properties of an either generic or preferably a domain specific ontology. This mapping allows to abstract from data management specific characteristics like normalization in relational database management systems. The mapping also has another advantage. As on the other hand data mining techniques are used to acquire the data's characteristics to decide whether or not a data tuple is correct, the conceptual view increases the "semantical density" of the data so that the machine learning algorithms used perform optimal.
Python
Jython
RDF
OWL
SPARQL
D2RQ
SESAME
Fabian Gruening
84d160207621f3d0321e5e3e4f15d3e78a22a941
Fabian Gruening
84d160207621f3d0321e5e3e4f15d3e78a22a941