Home
Welcome to the DuBio wiki!
DuBio is a PostgreSQL extension for managing and manipulating uncertain data or, to use the more technical term, probabilistic data. Managing data together with the uncertainty about that data is an effective way to get a grip on, and deal with, data quality issues. One prominent application is data integration. Probabilistic data integration (PDI) is a specific kind of data integration in which integration problems such as inconsistency and uncertainty are handled by means of a probabilistic data representation. The approach is based on the view that data quality problems (as they occur in an integration process) can be modeled as uncertainty, and that this uncertainty is an important result of the integration process. The PDI process has two phases: (i) a quick partial integration in which certain data quality problems are not solved immediately but are explicitly represented as uncertainty in the resulting integrated data, which is stored in a probabilistic database such as DuBio; (ii) continuous improvement by using the data (a probabilistic database can be queried directly, yielding possible or approximate answers) and gathering evidence (e.g., user feedback) to improve the data quality.
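To make the two PDI phases concrete, here is a minimal toy sketch in plain Python. It is not DuBio's API or data model: the function names `integrate`, `query`, and `condition`, the `trust_a` parameter, and the sample phone numbers are all hypothetical and chosen only for illustration. It shows a conflict between two sources being kept as uncertainty, a query returning the probability of an answer, and evidence being used to refine the data.

```python
from fractions import Fraction

# Hypothetical toy model, NOT DuBio's API: an uncertain value is a dict
# mapping each alternative to its probability (probabilities sum to 1).

def integrate(value_a, value_b, trust_a=Fraction(1, 2)):
    """Phase (i): keep a conflict between two sources as uncertainty."""
    if value_a == value_b:
        return {value_a: Fraction(1)}
    return {value_a: trust_a, value_b: 1 - trust_a}

def query(alternatives, predicate):
    """Return the probability that the predicate holds for the value."""
    return sum((p for v, p in alternatives.items() if predicate(v)), Fraction(0))

def condition(alternatives, predicate):
    """Phase (ii): drop alternatives refuted by evidence and renormalize."""
    kept = {v: p for v, p in alternatives.items() if predicate(v)}
    total = sum(kept.values(), Fraction(0))
    if total == 0:
        raise ValueError("evidence contradicts every alternative")
    return {v: p / total for v, p in kept.items()}

# Two sources disagree on a phone number; source A is trusted slightly more.
phone = integrate("555-0100", "555-0199", trust_a=Fraction(3, 5))

# Querying the uncertain data gives a possible answer with its probability.
p_first = query(phone, lambda v: v == "555-0100")

# User feedback says the second number is wrong; the uncertainty disappears.
phone_after_feedback = condition(phone, lambda v: v != "555-0199")
```

In this sketch the disagreement is never resolved at integration time; it only shrinks once evidence arrives, which is the behaviour the two phases describe.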
A good place to start learning what a probabilistic database is and what it can be useful for is the book chapter Probabilistic Data Integration, published in the Springer "Encyclopedia of Big Data Technologies". Besides serving as an introduction, it contains references to scientific publications on DuBio and on probabilistic databases in general.
- Installation instructions
- What does probabilistic data look like in DuBio
- Querying probabilistic data with DuBio
- Conditioning: Updating information on uncertainty in DuBio
- The dictionary data type
- The pgbdd data type for probabilistic sentences