Downloads & Free Reading Options - Results

Probabilistic Databases by Dan Suciu

Read "Probabilistic Databases" by Dan Suciu through these free online access and download options.

Books Results

Source: The Internet Archive

The Internet Archive Search Results

Books available to download or borrow from the Internet Archive

1. SkyQuery: An Implementation Of A Parallel Probabilistic Join Engine For Cross-Identification Of Multiple Astronomical Databases

“SkyQuery: An Implementation Of A Parallel Probabilistic Join Engine For Cross-Identification Of Multiple Astronomical Databases” Metadata:

  • Title: SkyQuery: An Implementation Of A Parallel Probabilistic Join Engine For Cross-Identification Of Multiple Astronomical Databases

Downloads Information:

This item is available for download in "texts" format. The files total 9.32 MB, have been downloaded 67 times, and went public on Fri Sep 20 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


2. DTIC ADA464777: Multi-Document Relationship Fusion Via Constraints On Probabilistic Databases

Previous multi-document relationship extraction and fusion research has focused on single relationships. Shifting the focus to multiple relationships allows for the use of mutual constraints to aid extraction. This paper presents a fusion method which uses a probabilistic database model to pick relationships which violate few constraints. This model allows improved performance on constructing corporate succession timelines from multiple documents with respect to a multi-document fusion baseline.

“DTIC ADA464777: Multi-Document Relationship Fusion Via Constraints On Probabilistic Databases” Metadata:

  • Title: DTIC ADA464777: Multi-Document Relationship Fusion Via Constraints On Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 8.29 MB, have been downloaded 57 times, and went public on Sat Jun 09 2018.

Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - OCR Page Index - OCR Search Text - Page Numbers JSON - Scandata - Single Page Processed JP2 ZIP - Text PDF - chOCR - hOCR


3. Microsoft Research Audio 104016: Managing Uncertainty Using Probabilistic Databases

Uncertainty is a fundamental problem underlying several modern database applications: exploratory queries in databases, data integration, querying information extracted from the Web, queries over sensor networks, scientific data management, reasoning about privacy breaches in data mining and many others. In this talk, I will describe probabilistic databases as a unifying framework to manage the various kinds of uncertainties that arise in this wide range of applications. In a probabilistic database, each data item has a probability of belonging to the database and queries return answers that are ranked by probabilities. The main challenge here is query evaluation. Unlike in traditional databases, some queries have a #P-complete complexity. I will present the results of our study of the complexity of queries and present algorithms and techniques for efficient query evaluation over probabilistic databases. ©2007 Microsoft Corporation. All rights reserved.
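
As a rough illustration of the idea in this abstract (not code from the talk), the sketch below assumes a hypothetical tuple-independent table `readings`, enumerates every possible world, and ranks the answers to a "distinct rooms" query by probability. The table contents and the function names are made up for illustration. Brute-force enumeration takes exponential time and is used here only to make the semantics concrete.

```python
from itertools import product

# Hypothetical tuple-independent table: (sensor, room) -> probability the tuple is present.
readings = {
    ("s1", "kitchen"): 0.9,
    ("s2", "kitchen"): 0.5,
    ("s3", "hall"): 0.4,
}


def answer_probabilities(table):
    """Probability that each room appears in the query answer, by enumerating all worlds."""
    tuples = list(table)
    probs = {}
    for present in product([False, True], repeat=len(tuples)):
        # Weight of this possible world under tuple independence.
        weight = 1.0
        for t, keep in zip(tuples, present):
            weight *= table[t] if keep else 1.0 - table[t]
        # Answer of "SELECT DISTINCT room" in this world.
        rooms = {t[1] for t, keep in zip(tuples, present) if keep}
        for room in rooms:
            probs[room] = probs.get(room, 0.0) + weight
    return probs


def ranked_answers(table):
    """Answers sorted by decreasing probability."""
    return sorted(answer_probabilities(table).items(), key=lambda kv: -kv[1])
```

For this particular query the same numbers follow directly from independence (for example, 1 - (1 - 0.9)(1 - 0.5) for the kitchen), which is the kind of shortcut that efficient query evaluation tries to exploit.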

“Microsoft Research Audio 104016: Managing Uncertainty Using Probabilistic Databases” Metadata:

  • Title: Microsoft Research Audio 104016: Managing Uncertainty Using Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "audio" format. The files total 40.59 MB, have been downloaded 4 times, and went public on Sat Nov 23 2013.

Available formats:
Archive BitTorrent - Item Tile - Metadata - Ogg Vorbis - PNG - VBR MP3


4. Microsoft Research Video 103486: Managing Large-scale Probabilistic Databases

For the next generation of data-management applications, such as sensor-based monitoring, data integration, and information extraction, data processing is the dominant cost. Often, the data driving these applications are uncertain, for example, due to missed, inconsistent, or imprecise sensor readings. Unfortunately, traditional data-management systems provide little or no support for managing uncertainty. To remedy this, my dissertation advocates an approach for data management in which uncertainty is modeled using probabilities. The cost of modeling imprecision using probabilities is that basic data-management tasks, such as querying, become theoretically and practically more difficult. Thus, the key challenge in managing large-scale probabilistic data is efficiency. In this talk, I will discuss the fundamental techniques that I developed in my dissertation to build a probabilistic database capable of handling large, imprecise datasets: these techniques include top-k processing with probabilities, materialized views, approximate lineage, and extensional processing for complex analytic queries. This work resulted in two systems: Mystiq, the first system to support complex queries on gigabytes of probabilistic relational data, and Lahar, the first system to support rich event-style queries on large, probabilistic streams. ©2009 Microsoft Corporation. All rights reserved.

“Microsoft Research Video 103486: Managing Large-scale Probabilistic Databases” Metadata:

  • Title: Microsoft Research Video 103486: Managing Large-scale Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "movies" format. The files total 913.56 MB, have been downloaded 49 times, and went public on Mon Feb 10 2014.

Available formats:
Animated GIF - Archive BitTorrent - Item Tile - Metadata - Ogg Video - Thumbnail - Windows Media - h.264


5. Semantics And Evaluation Of Top-k Queries In Probabilistic Databases

We study here fundamental issues involved in top-k query evaluation in probabilistic databases. We consider simple probabilistic databases in which probabilities are associated with individual tuples, and general probabilistic databases in which, additionally, exclusivity relationships between tuples can be represented. In contrast to other recent research in this area, we do not limit ourselves to injective scoring functions. We formulate three intuitive postulates that the semantics of top-k queries in probabilistic databases should satisfy, and introduce a new semantics, Global-Topk, that satisfies those postulates to a large degree. We also show how to evaluate queries under the Global-Topk semantics. For simple databases we design dynamic-programming based algorithms, and for general databases we show polynomial-time reductions to the simple cases. For example, we demonstrate that for a fixed k the time complexity of top-k query evaluation is as low as linear, under the assumption that probabilistic databases are simple and scoring functions are injective.

“Semantics And Evaluation Of Top-k Queries In Probabilistic Databases” Metadata:

  • Title: Semantics And Evaluation Of Top-k Queries In Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 25.75 MB, have been downloaded 64 times, and went public on Sun Sep 22 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


6. Conditioning Probabilistic Databases

Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases however often involve the conditioning of a database using additional information in the form of new evidence. The conditioning problem is thus to transform a probabilistic database of priors into a posterior probabilistic database which is materialized for subsequent query processing or further refinement. It turns out that the conditioning problem is closely related to the problem of computing exact tuple confidence values. It is known that exact confidence computation is an NP-hard problem. This has led researchers to consider approximation techniques for confidence computation. However, neither conditioning nor exact confidence computation can be solved using such techniques. In this paper we present efficient techniques for both problems. We study several problem decomposition methods and heuristics that are based on the most successful search techniques from constraint satisfaction, such as the Davis-Putnam algorithm. We complement this with a thorough experimental evaluation of the algorithms proposed. Our experiments show that our exact algorithms scale well to realistic database sizes and can in some scenarios compete with the most efficient previous approximation algorithms.

“Conditioning Probabilistic Databases” Metadata:

  • Title: Conditioning Probabilistic Databases

Downloads Information:

This item is available for download in "texts" format. The files total 11.54 MB, have been downloaded 82 times, and went public on Wed Sep 18 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


7. Microsoft Research Audio 103486: Managing Large-scale Probabilistic Databases

For the next generation of data-management applications, such as sensor-based monitoring, data integration, and information extraction, data processing is the dominant cost. Often, the data driving these applications are uncertain, for example, due to missed, inconsistent, or imprecise sensor readings. Unfortunately, traditional data-management systems provide little or no support for managing uncertainty. To remedy this, my dissertation advocates an approach for data management in which uncertainty is modeled using probabilities. The cost of modeling imprecision using probabilities is that basic data-management tasks, such as querying, become theoretically and practically more difficult. Thus, the key challenge in managing large-scale probabilistic data is efficiency. In this talk, I will discuss the fundamental techniques that I developed in my dissertation to build a probabilistic database capable of handling large, imprecise datasets: these techniques include top-k processing with probabilities, materialized views, approximate lineage, and extensional processing for complex analytic queries. This work resulted in two systems: Mystiq, the first system to support complex queries on gigabytes of probabilistic relational data, and Lahar, the first system to support rich event-style queries on large, probabilistic streams.

“Microsoft Research Audio 103486: Managing Large-scale Probabilistic Databases” Metadata:

  • Title: Microsoft Research Audio 103486: Managing Large-scale Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "audio" format. The files total 53.05 MB, have been downloaded 5 times, and went public on Sat Nov 23 2013.

Available formats:
Archive BitTorrent - Columbia Peaks - Essentia High GZ - Essentia Low GZ - Item Tile - Metadata - Ogg Vorbis - PNG - Spectrogram - VBR MP3


8. Probabilistic Databases

“Probabilistic Databases” Metadata:

  • Title: Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 398.14 MB, have been downloaded 21 times, and went public on Fri Jul 21 2023.

Available formats:
ACS Encrypted PDF - Cloth Cover Detection Log - DjVuTXT - Djvu XML - Dublin Core - Item Tile - JPEG Thumb - JSON - LCP Encrypted EPUB - LCP Encrypted PDF - Log - MARC - MARC Binary - Metadata - OCR Page Index - OCR Search Text - PNG - Page Numbers JSON - RePublisher Final Processing Log - RePublisher Initial Processing Log - Scandata - Single Page Original JP2 Tar - Single Page Processed JP2 ZIP - Text PDF - Title Page Detection Log - chOCR - hOCR


9. Explicit Probabilistic Models For Databases And Networks

Recent work in data mining and related areas has highlighted the importance of the statistical assessment of data mining results. Crucial to this endeavour is the choice of a non-trivial null model for the data, to which the found patterns can be contrasted. The most influential null models proposed so far are defined in terms of invariants of the null distribution. Such null models can be used by computation intensive randomization approaches in estimating the statistical significance of data mining results. Here, we introduce a methodology to construct non-trivial probabilistic models based on the maximum entropy (MaxEnt) principle. We show how MaxEnt models allow for the natural incorporation of prior information. Furthermore, they satisfy a number of desirable properties of previously introduced randomization approaches. Lastly, they also have the benefit that they can be represented explicitly. We argue that our approach can be used for a variety of data types. However, for concreteness, we have chosen to demonstrate it in particular for databases and networks.

“Explicit Probabilistic Models For Databases And Networks” Metadata:

  • Title: Explicit Probabilistic Models For Databases And Networks
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 10.29 MB, have been downloaded 72 times, and went public on Sun Sep 22 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


10. Efficient Subgraph Similarity Search On Large Probabilistic Graph Databases

Many studies have been conducted on seeking an efficient solution for subgraph similarity search over certain (deterministic) graphs due to its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy-preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Different from previous works assuming that edges in an uncertain graph are independent of each other, we study the uncertain graphs where edges' occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete; thus, we employ a filter-and-verify framework to speed up the search. In the filtering phase, we develop tight lower and upper bounds of subgraph similarity probability based on a probabilistic matrix index, PMI. PMI is composed of discriminative subgraph features associated with tight lower and upper bounds of subgraph isomorphism probability. Based on PMI, we can sort out a large number of probabilistic graphs and maximize the pruning capability. During the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. The efficiency of our proposed solutions has been verified through extensive experiments.

“Efficient Subgraph Similarity Search On Large Probabilistic Graph Databases” Metadata:

  • Title: Efficient Subgraph Similarity Search On Large Probabilistic Graph Databases
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 12.15 MB, have been downloaded 97 times, and went public on Fri Sep 20 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


11. Approximate Lifted Inference With Probabilistic Databases

This paper proposes a new approach for approximate evaluation of #P-hard queries with probabilistic databases. In our approach, every query is evaluated entirely in the database engine by evaluating a fixed number of query plans, each providing an upper bound on the true probability, then taking their minimum. We provide an algorithm that takes into account important schema information to enumerate only the minimal necessary plans among all possible plans. Importantly, this algorithm is a strict generalization of all known results of PTIME self-join-free conjunctive queries: A query is safe if and only if our algorithm returns one single plan. We also apply three relational query optimization techniques to evaluate all minimal safe plans very fast. We give a detailed experimental evaluation of our approach and, in the process, provide a new way of thinking about the value of probabilistic methods over non-probabilistic methods for ranking query answers.

“Approximate Lifted Inference With Probabilistic Databases” Metadata:

  • Title: Approximate Lifted Inference With Probabilistic Databases

Downloads Information:

This item is available for download in "texts" format. The files total 1.68 MB, have been downloaded 28 times, and went public on Sat Jun 30 2018.

Available formats:
Archive BitTorrent - Metadata - Text PDF


12. Microsoft Research Video 104016: Managing Uncertainty Using Probabilistic Databases

Uncertainty is a fundamental problem underlying several modern database applications: exploratory queries in databases, data integration, querying information extracted from the Web, queries over sensor networks, scientific data management, reasoning about privacy breaches in data mining and many others. In this talk, I will describe probabilistic databases as a unifying framework to manage the various kinds of uncertainties that arise in this wide range of applications. In a probabilistic database, each data item has a probability of belonging to the database and queries return answers that are ranked by probabilities. The main challenge here is query evaluation. Unlike in traditional databases, some queries have a #P-complete complexity. I will present the results of our study of the complexity of queries and present algorithms and techniques for efficient query evaluation over probabilistic databases. ©2007 Microsoft Corporation. All rights reserved.

“Microsoft Research Video 104016: Managing Uncertainty Using Probabilistic Databases” Metadata:

  • Title: Microsoft Research Video 104016: Managing Uncertainty Using Probabilistic Databases
  • Language: English

Downloads Information:

This item is available for download in "movies" format. The files total 598.84 MB, have been downloaded 60 times, and went public on Wed Apr 30 2014.

Available formats:
Animated GIF - Archive BitTorrent - Item Tile - Metadata - Ogg Video - Thumbnail - Windows Media - h.264


13. Consensus Answers For Queries Over Probabilistic Databases

We address the problem of finding a "best" deterministic query answer to a query over a probabilistic database. For this purpose, we propose the notion of a consensus world (or a consensus answer) which is a deterministic world (answer) that minimizes the expected distance to the possible worlds (answers). This problem can be seen as a generalization of the well-studied inconsistent information aggregation problems (e.g. rank aggregation) to probabilistic databases. We consider this problem for various types of queries including SPJ queries, Top-k queries, group-by aggregate queries, and clustering. For different distance metrics, we obtain polynomial time optimal or approximation algorithms for computing the consensus answers (or prove NP-hardness). Most of our results are for a general probabilistic database model, called the and/xor tree model, which significantly generalizes previous probabilistic database models like x-tuples and block-independent disjoint models, and is of independent interest.

“Consensus Answers For Queries Over Probabilistic Databases” Metadata:

  • Title: Consensus Answers For Queries Over Probabilistic Databases

Downloads Information:

This item is available for download in "texts" format. The files total 8.94 MB, have been downloaded 72 times, and went public on Sun Sep 22 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


14. Probabilistic Frequent Pattern Growth For Itemset Mining In Uncertain Databases (Technical Report)

Frequent itemset mining in uncertain transaction databases semantically and computationally differs from traditional techniques applied on standard (certain) transaction databases. Uncertain transaction databases consist of sets of existentially uncertain items. The uncertainty of items in transactions makes traditional techniques inapplicable. In this paper, we tackle the problem of finding probabilistic frequent itemsets based on possible world semantics. In this context, an itemset X is called frequent if the probability that X occurs in at least minSup transactions is above a given threshold. We make the following contributions: We propose the first probabilistic FP-Growth algorithm (ProFP-Growth) and associated probabilistic FP-Tree (ProFP-Tree), which we use to mine all probabilistic frequent itemsets in uncertain transaction databases without candidate generation. In addition, we propose an efficient technique to compute the support probability distribution of an itemset in linear time using the concept of generating functions. An extensive experimental section evaluates our proposed techniques and shows that our ProFP-Growth approach is significantly faster than the current state-of-the-art algorithm.
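
The generating-function idea mentioned in this abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' ProFP-Growth code. It assumes transactions are independent and that transaction i contains itemset X with probability `probs[i]`. The function names and the threshold parameter `tau` are hypothetical. The support distribution is read off the coefficients of the product of (1 - p_i + p_i x), truncated at degree `min_sup`, so the cost is linear in the number of transactions for a fixed `min_sup`.

```python
def support_distribution(probs, min_sup):
    """Coefficients of prod(1 - p + p*x), truncated at degree min_sup.

    Entry k (k < min_sup) is P(support == k); the last entry collects
    all probability mass for support >= min_sup.
    """
    coeffs = [1.0] + [0.0] * min_sup
    for p in probs:
        new = [0.0] * (min_sup + 1)
        for k, c in enumerate(coeffs):
            new[k] += c * (1.0 - p)            # X absent from this transaction
            new[min(k + 1, min_sup)] += c * p  # X present; cap at min_sup
        coeffs = new
    return coeffs


def is_probabilistic_frequent(probs, min_sup, tau):
    """True if P(support >= min_sup) exceeds the threshold tau."""
    return support_distribution(probs, min_sup)[-1] > tau
```

For example, with `probs = [0.8, 0.5, 0.4]` and `min_sup = 2`, the last coefficient is the probability that X occurs in at least two of the three transactions.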

“Probabilistic Frequent Pattern Growth For Itemset Mining In Uncertain Databases (Technical Report)” Metadata:

  • Title: Probabilistic Frequent Pattern Growth For Itemset Mining In Uncertain Databases (Technical Report)
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 10.31 MB, have been downloaded 77 times, and went public on Sat Sep 21 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


15. Probabilistic Ranking Techniques In Relational Databases

“Probabilistic Ranking Techniques In Relational Databases” Metadata:

  • Title: Probabilistic Ranking Techniques In Relational Databases
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 125.56 MB, have been downloaded 18 times, and went public on Tue Jul 25 2023.

Available formats:
ACS Encrypted PDF - Cloth Cover Detection Log - DjVuTXT - Djvu XML - Dublin Core - Item Tile - JPEG Thumb - JSON - LCP Encrypted EPUB - LCP Encrypted PDF - Log - MARC - MARC Binary - Metadata - OCR Page Index - OCR Search Text - PNG - Page Numbers JSON - RePublisher Final Processing Log - RePublisher Initial Processing Log - Scandata - Single Page Original JP2 Tar - Single Page Processed JP2 ZIP - Text PDF - Title Page Detection Log - chOCR - hOCR


16. Faster Query Answering In Probabilistic Databases Using Read-Once Functions

A boolean expression is in read-once form if each of its variables appears exactly once. When the variables denote independent events in a probability space, the probability of the event denoted by the whole expression in read-once form can be computed in polynomial time (whereas the general problem for arbitrary expressions is #P-complete). Known approaches to checking read-once property seem to require putting these expressions in disjunctive normal form. In this paper, we tell a better story for a large subclass of boolean event expressions: those that are generated by conjunctive queries without self-joins and on tuple-independent probabilistic databases. We first show that given a tuple-independent representation and the provenance graph of an SPJ query plan without self-joins, we can, without using the DNF of a result event expression, efficiently compute its co-occurrence graph. From this, the read-once form can already, if it exists, be computed efficiently using existing techniques. Our second and key contribution is a complete, efficient, and simple to implement algorithm for computing the read-once forms (whenever they exist) directly, using a new concept, that of co-table graph, which can be significantly smaller than the co-occurrence graph.
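
To make the polynomial-time claim for read-once expressions concrete, here is a minimal sketch. It is not the paper's algorithm for finding read-once forms; it only evaluates the probability once such a form is given. The nested-tuple representation and the function name are assumptions made for illustration. Because every variable appears exactly once, the children of each node depend on disjoint sets of independent variables, so AND multiplies probabilities and OR uses the complement of the product of complements.

```python
def read_once_probability(expr, prob):
    """Probability of a read-once expression over independent variables.

    expr is either a variable name (str) or a tuple
    ('and' | 'or', child, child, ...); prob maps variable names
    to their marginal probabilities.
    """
    if isinstance(expr, str):
        return prob[expr]
    op, *children = expr
    child_probs = [read_once_probability(c, prob) for c in children]
    if op == "and":
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if op == "or":
        none_true = 1.0
        for p in child_probs:
            none_true *= 1.0 - p
        return 1.0 - none_true
    raise ValueError(f"unknown operator: {op}")
```

For instance, the expression `("or", ("and", "x1", "y1"), ("and", "x2", "y2"))` with every variable at probability 0.5 is evaluated in a single pass over the tree. The same recursion would give wrong results on an expression that repeats a variable, since the independence of the subexpressions no longer holds.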

“Faster Query Answering In Probabilistic Databases Using Read-Once Functions” Metadata:

  • Title: Faster Query Answering In Probabilistic Databases Using Read-Once Functions
  • Language: English

Downloads Information:

This item is available for download in "texts" format. The files total 24.19 MB, have been downloaded 76 times, and went public on Mon Sep 23 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF


17A Unified Approach To Ranking In Probabilistic Databases

By

The dramatic growth in the number of application domains that naturally generate probabilistic, uncertain data has resulted in a need for efficiently supporting complex querying and decision-making over such data. In this paper, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criteria optimization problem, and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRF-w and PRF-e, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating functions-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRF-e, at approximating other ranking functions and the scalability of our proposed algorithms for exact or approximate ranking.

“A Unified Approach To Ranking In Probabilistic Databases” Metadata:

  • Title: ➤  A Unified Approach To Ranking In Probabilistic Databases
  • Authors:
  • Language: English

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 15.06 Mbs, the file-s for this book were downloaded 86 times, the file-s went public at Mon Sep 23 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find A Unified Approach To Ranking In Probabilistic Databases at online marketplaces:


18Detecting Dependencies In Sparse, Multivariate Databases Using Probabilistic Programming And Non-parametric Bayes

By

Datasets with hundreds of variables and many missing values are commonplace. In this setting, it is both statistically and computationally challenging to detect true predictive relationships between variables and also to suppress false positives. This paper proposes an approach that combines probabilistic programming, information theory, and non-parametric Bayes. It shows how to use Bayesian non-parametric modeling to (i) build an ensemble of joint probability models for all the variables; (ii) efficiently detect marginal independencies; and (iii) estimate the conditional mutual information between arbitrary subsets of variables, subject to a broad class of constraints. Users can access these capabilities using BayesDB, a probabilistic programming platform for probabilistic data analysis, by writing queries in a simple, SQL-like language. This paper demonstrates empirically that the method can (i) detect context-specific (in)dependencies on challenging synthetic problems and (ii) yield improved sensitivity and specificity over baselines from statistics and machine learning, on a real-world database of over 300 sparsely observed indicators of macroeconomic development and public health.

“Detecting Dependencies In Sparse, Multivariate Databases Using Probabilistic Programming And Non-parametric Bayes” Metadata:

  • Title: ➤  Detecting Dependencies In Sparse, Multivariate Databases Using Probabilistic Programming And Non-parametric Bayes
  • Authors:

“Detecting Dependencies In Sparse, Multivariate Databases Using Probabilistic Programming And Non-parametric Bayes” Subjects and Themes:

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 1.71 Mbs, the file-s for this book were downloaded 18 times, the file-s went public at Fri Jun 29 2018.

Available formats:
Archive BitTorrent - Metadata - Text PDF -

Related Links:

Online Marketplaces

Find Detecting Dependencies In Sparse, Multivariate Databases Using Probabilistic Programming And Non-parametric Bayes at online marketplaces:


19Scalable Probabilistic Similarity Ranking In Uncertain Databases (Technical Report)

“Scalable Probabilistic Similarity Ranking In Uncertain Databases (Technical Report)” Metadata:

  • Title: ➤  Scalable Probabilistic Similarity Ranking In Uncertain Databases (Technical Report)

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 15.92 Mbs, the file-s for this book were downloaded 61 times, the file-s went public at Fri Sep 20 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find Scalable Probabilistic Similarity Ranking In Uncertain Databases (Technical Report) at online marketplaces:


20Consistency Checking And Querying In Probabilistic Databases Under Integrity Constraints

By

We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases. To this aim, we move away from the common way of giving semantics to probabilistic databases, which relies on considering a unique interpretation of the data, and address two fundamental problems: consistency checking and query evaluation. The former consists in verifying whether there is an interpretation which conforms to both the marginal probabilities of the tuples and the integrity constraints. The latter is the problem of answering queries under a "cautious" paradigm, taking into account all interpretations of the data in accordance with the constraints. In this setting, we investigate the complexity of the above-mentioned problems, and identify several tractable cases of practical relevance.

“Consistency Checking And Querying In Probabilistic Databases Under Integrity Constraints” Metadata:

  • Title: ➤  Consistency Checking And Querying In Probabilistic Databases Under Integrity Constraints
  • Authors:
  • Language: English

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 36.93 Mbs, the file-s for this book were downloaded 59 times, the file-s went public at Mon Sep 23 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find Consistency Checking And Querying In Probabilistic Databases Under Integrity Constraints at online marketplaces:


21On A Theory Of Probabilistic Deductive Databases

By

We propose a framework for modeling uncertainty where both belief and doubt can be given independent, first-class status. We adopt probability theory as the mathematical formalism for manipulating uncertainty. An agent can express the uncertainty in her knowledge about a piece of information in the form of a confidence level, consisting of a pair of intervals of probability, one for each of her belief and doubt. The space of confidence levels naturally leads to the notion of a trilattice, similar in spirit to Fitting's bilattices. Intuitively, the points in such a trilattice can be ordered according to truth, information, or precision. We develop a framework for probabilistic deductive databases by associating confidence levels with the facts and rules of a classical deductive database. While the trilattice structure offers a variety of choices for defining the semantics of probabilistic deductive databases, our choice of semantics is based on the truth-ordering, which we find to be closest to the classical framework for deductive databases. In addition to proposing a declarative semantics based on valuations and an equivalent semantics based on fixpoint theory, we also propose a proof procedure and prove it sound and complete. We show that while classical Datalog query programs have a polynomial time data complexity, certain query programs in the probabilistic deductive database framework do not even terminate on some input databases. We identify a large natural class of query programs of practical interest in our framework, and show that programs in this class possess polynomial time data complexity, i.e., not only do they terminate on every input database, they are guaranteed to do so in a number of steps polynomial in the input database size.

“On A Theory Of Probabilistic Deductive Databases” Metadata:

  • Title: ➤  On A Theory Of Probabilistic Deductive Databases
  • Authors:
  • Language: English

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 21.06 Mbs, the file-s for this book were downloaded 82 times, the file-s went public at Mon Sep 23 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find On A Theory Of Probabilistic Deductive Databases at online marketplaces:


22A Novel Probabilistic Pruning Approach To Speed Up Similarity Queries In Uncertain Databases

By

In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for probabilistic similarity queries on uncertain data. Our approach supports a general uncertainty model using continuous probabilistic density functions to describe the (possibly correlated) uncertain attributes of objects. In a nutshell, the problem to be solved is to compute the PDF of the random variable denoted by the probabilistic domination count: Given an uncertain database object B, an uncertain reference object R and a set D of uncertain database objects in a multi-dimensional space, the probabilistic domination count denotes the number of uncertain objects in D that are closer to R than B. This domination count can be used to answer a wide range of probabilistic similarity queries. Specifically, we propose a novel geometric pruning filter and introduce an iterative filter-refinement strategy for conservatively and progressively estimating the probabilistic domination count in an efficient way while keeping correctness according to the possible world semantics. In an experimental evaluation, we show that our proposed technique yields tight probability bounds for the probabilistic domination count quickly, even for large uncertain databases.

“A Novel Probabilistic Pruning Approach To Speed Up Similarity Queries In Uncertain Databases” Metadata:

  • Title: ➤  A Novel Probabilistic Pruning Approach To Speed Up Similarity Queries In Uncertain Databases
  • Authors:
  • Language: English

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 14.72 Mbs, the file-s for this book were downloaded 90 times, the file-s went public at Sun Sep 22 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find A Novel Probabilistic Pruning Approach To Speed Up Similarity Queries In Uncertain Databases at online marketplaces:


23Probabilistic Databases With MarkoViews

“Probabilistic Databases With MarkoViews” Metadata:

  • Title: ➤  Probabilistic Databases With MarkoViews

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 29.19 Mbs, the file-s for this book were downloaded 99 times, the file-s went public at Fri Sep 20 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find Probabilistic Databases With MarkoViews at online marketplaces:


24Scalable Probabilistic Databases With Factor Graphs And MCMC

By

Probabilistic databases play a crucial role in the management and understanding of uncertain data. However, incorporating probabilities into the semantics of incomplete databases has posed many challenges, forcing systems to sacrifice modeling power, scalability, or restrict the class of relational algebra formula under which they are closed. We propose an alternative approach where the underlying relational database always represents a single world, and an external factor graph encodes a distribution over possible worlds; Markov chain Monte Carlo (MCMC) inference is then used to recover this uncertainty to a desired level of fidelity. Our approach allows the efficient evaluation of arbitrary queries over probabilistic databases with arbitrary dependencies expressed by graphical models with structure that changes during inference. MCMC sampling provides efficiency by hypothesizing modifications to possible worlds rather than generating entire worlds from scratch. Queries are then run over the portions of the world that change, avoiding the onerous cost of running full queries over each sampled world. A significant innovation of this work is the connection between MCMC sampling and materialized view maintenance techniques: we find empirically that using view maintenance techniques is several orders of magnitude faster than naively querying each sampled world. We also demonstrate our system's ability to answer relational queries with aggregation, and demonstrate additional scalability through the use of parallelization.

“Scalable Probabilistic Databases With Factor Graphs And MCMC” Metadata:

  • Title: ➤  Scalable Probabilistic Databases With Factor Graphs And MCMC
  • Authors:
  • Language: English

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 23.27 Mbs, the file-s for this book were downloaded 112 times, the file-s went public at Fri Jul 19 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find Scalable Probabilistic Databases With Factor Graphs And MCMC at online marketplaces:


25Aggregation In Probabilistic Databases Via Knowledge Compilation

By

This paper presents a query evaluation technique for positive relational algebra queries with aggregates on a representation system for probabilistic data based on the algebraic structures of semiring and semimodule. The core of our evaluation technique is a procedure that compiles semimodule and semiring expressions into so-called decomposition trees, for which the computation of the probability distribution can be done in time linear in the product of the sizes of the probability distributions represented by its nodes. We give syntactic characterisations of tractable queries with aggregates by exploiting the connection between query tractability and polynomial-time decomposition trees. A prototype of the technique is incorporated in the probabilistic database engine SPROUT. We report on performance experiments with custom datasets and TPC-H data.

“Aggregation In Probabilistic Databases Via Knowledge Compilation” Metadata:

  • Title: ➤  Aggregation In Probabilistic Databases Via Knowledge Compilation
  • Authors:
  • Language: English

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 14.16 Mbs, the file-s for this book were downloaded 62 times, the file-s went public at Wed Sep 18 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find Aggregation In Probabilistic Databases Via Knowledge Compilation at online marketplaces:


26Defining And Mining Functional Dependencies In Probabilistic Databases

By

Functional dependencies (traditional, approximate and conditional) are of critical importance in relational databases, as they inform us about the relationships between attributes. They are useful in schema normalization, data rectification and source selection. Most of these were however developed in the context of deterministic data. Although uncertain databases have started receiving attention, these dependencies have not been defined for them, nor are fast algorithms available to evaluate their confidences. This paper defines the logical extensions of various forms of functional dependencies for probabilistic databases and explores the connections between them. We propose a pruning-based exact algorithm to evaluate the confidence of functional dependencies, a Monte-Carlo based algorithm to evaluate the confidence of approximate functional dependencies and algorithms for their conditional counterparts in probabilistic databases. Experiments are performed on both synthetic and real data evaluating the performance of these algorithms in assessing the confidence of dependencies and mining them from data. We believe that having these dependencies and algorithms available for probabilistic databases will drive adoption of probabilistic data storage in the industry.

“Defining And Mining Functional Dependencies In Probabilistic Databases” Metadata:

  • Title: ➤  Defining And Mining Functional Dependencies In Probabilistic Databases
  • Authors:

Edition Identifiers:

Downloads Information:

The book is available for download in "texts" format, the size of the file-s is: 11.28 Mbs, the file-s for this book were downloaded 98 times, the file-s went public at Fri Jul 19 2013.

Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -

Related Links:

Online Marketplaces

Find Defining And Mining Functional Dependencies In Probabilistic Databases at online marketplaces:


Buy “Probabilistic Databases” online:

Shop for “Probabilistic Databases” on popular online marketplaces.