<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE projects PUBLIC "GROUP DTD" "http://dbweb.enst.fr/group.dtd">
<projects xmlns:h="http://www.w3.org/1999/xhtml">
  <introduction>Data managed by information systems are more and more complex,
    distributed, heterogeneous, dynamic, and of various forms. The DBWeb project focuses on
    the fundamental issues raised in modern data and knowledge
    management systems, especially on the Web or in collaborative
    contexts oriented towards peer-to-peer networks. Our research interests
    cover the theoretical foundations of database management systems,
    practical solutions and applications, and cognitive aspects.
    Here are the main research projects we are involved in.</introduction>
      
  <project id="trust">
    <title>Trust Management in Open Communities and Online Social Networks</title>
    <acronym>ISICIL</acronym>
    <logo href="logos/isicil.png"/>
    <homepage>http://isicil.inria.fr/</homepage>
    <duration from="2009" to="2011"/>
    <people>
      <internal_ref ref="imen"/>
      <internal_ref ref="talel"/>
      <internal_ref ref="bogdan"/>
      <internal_ref ref="silviu"/>
    </people>
    <abstract>
Our contribution to the ISICIL project concerns trust management in open communities and online social networks. It is often necessary in data management
applications to control the ways in which data is accessed, modified and transformed. When data is under
centralized control, arbitrarily complex restriction scenarios can be actively enforced inside the boundaries of
the owner. All this becomes much harder when data cannot be actively controlled and monitored, for instance
when it is shared in a distributed and open context such as large social networks for information and knowledge
sharing. The management of trust and privacy is becoming crucial in many applications, like the collaborative
publishing of information (Wikipedia, open software communities, eBay) or social network applications.

Many novel issues are raised in such contexts and one of our objectives is to study appropriate models and tools
for trust and privacy management. More precisely, the ISICIL project must innovate on the following points: (1) better
understanding the use of trust models and their limits in open communities; (2) developing suitable models for privacy based on trust measures in open communities for
information sharing and publishing. In particular, these models should allow data owners to preserve their
anonymity and to control how private information is disseminated, accessed or modified.
    </abstract>
    <funding>The ISICIL project (2009–2011) is sponsored by the French
      national research agency ANR (Agence Nationale de la Recherche),
      within the programme Content and Interactions (CONTINT).</funding>
  </project>
  <project id="StructuredWeb">
    <title>Extraction and Querying of Complex Objects from the Structured Web</title>
    <acronym>DEUS</acronym>
    <people>
      <internal_ref ref="talel"/>
      <internal_ref ref="bogdan"/>
      <internal_ref ref="pierre"/>
      <internal_ref ref="nora"/>
    </people>
    <abstract>
We are witnessing today a tremendous growth of the so-called structured Web,
in which documents are no longer quasi-textual, but are data-centric, presenting structured content, complex objects, 
a Web in which information is no longer served “raw”.  
While current search platforms have mostly benefited from top research in information retrieval 
and distributed systems, the shift towards schematized data calls for more precise, richer querying of the 
Web and raises new challenges to which the data management community can provide answers.

We believe that a key challenge for future Web interrogation is to leverage the structured part of the Web, for better understanding, extraction and access to data that would enable richer search interactions. 

Entity search applications focus on simple, atomic objects. We expand beyond them, building on the idea
that such simple, easily recognizable entities can be further organized into more complex relations or objects,
often with spatio-temporal components, called structured object descriptions (SODs). In this project, we intend to study the theoretical and practical challenges raised by querying for complex objects on the structured Web.
    </abstract>
  </project>
  <project id="artificial_intelligence">
    <title>Artificial intelligence and cognitive aspects</title>
    <acronym>MILC</acronym>
    <homepage>http://www.infres.enst.fr/~milc/Research.html</homepage>
    <people>
      <member_ref ref="jean-louis"/>
      <member_ref ref="antoine"/>
      <member_ref ref="damien"/>
      <member_ref ref="nicoleta"/>
    </people>
    <abstract>This research on language and cognition (MILC sub-project) focuses 
      on the quest for fundamental principles 
      underlying the language faculty and the will to communicate. 
      The main areas of interest are currently relevance and honest communication.
    </abstract>
  </project>
  <project id="webdam">
    <title>Foundations of Web data management</title>
    <acronym>Webdam</acronym>
    <logo href="logos/webdam.png"/>
    <homepage>http://webdam.inria.fr/wordpress/</homepage>
    <duration from="2009" to="2013"/>
    <people>
      <member_ref ref="pierre"/>
      <member_ref ref="marilena"/>
      <member_ref ref="fabian"/>
      <member_ref ref="yael"/>
    </people>
    <consortium>
      <institution_ref ref="inria_saclay"/>
    </consortium>
    <abstract>The goal of the Webdam project, headed by Serge Abiteboul
      from INRIA Saclay, is to develop a formal model
      for Web data management. This model will open new horizons for the
      development of the Web in a well-principled way, enhancing its
      functionality, performance, and reliability. Specifically, the goal
      is to develop a universally accepted formal framework for
      describing complex and flexible interacting Web applications
      featuring notably data exchange, sharing, integration, querying and
      updating. We also propose to develop formal foundations that will
      enable peers to concurrently reason about global data management
      activities, cooperate in solving specific tasks and support
      services with desired quality of service. Although the proposal
      addresses fundamental issues, its goal is to serve as the basis for
      future software development for Web data management.</abstract>
    <funding>The Webdam project is funded by the European Research
      Council under the European Community’s Seventh Framework Programme
      (FP7/2007-2013) / ERC grant Webdam, agreement n° 226513.</funding>
  </project>
  <project id="dataring">
    <title>P2P Data Sharing for Online Communities</title>
    <acronym>DataRing</acronym>
    <logo href="logos/dataring.jpg"/>
    <homepage>http://www.lina.univ-nantes.fr/projets/DataRing/</homepage>
    <duration from="2009" to="2012"/>
    <people>
      <member_ref ref="talel"/>
      <member_ref ref="asma"/>
      <member_ref ref="pierre"/>
      <member_ref ref="lamine"/>
    </people>
    <consortium>
      <institution_ref ref="inria_saclay"/>
      <institution_ref ref="inria_rennes"/>
      <institution_ref ref="lig"/>
      <institution_ref ref="lirmm"/>
    </consortium>
    <abstract>
      The DataRing project addresses the problem of P2P data sharing for
      online communities, by offering a high-level network ring across
      distributed data source owners. Users may be numerous and
      interested in different kinds of collaboration and in sharing their
      knowledge, ideas, experiences, etc. Data sources can also be
      numerous, fairly autonomous, i.e. locally owned and controlled, and
      highly heterogeneous with different semantics and structures. What
      we need then is new decentralized data management techniques that
      scale up while addressing the autonomy, dynamic behavior and
      heterogeneity of both users and data sources.</abstract>
    <funding>The DataRing project (2009–2011) is sponsored by the French
      national research agency ANR (Agence Nationale de la Recherche),
      within the programme Future Networks and Services
      (VERSO).</funding>
  </project>
  <project>
    <title>Query-Driven Data Acquisition from Web-Based Data Sources</title>
    <acronym>WebPlan</acronym>
    <homepage>http://web.comlab.ox.ac.uk/isg/projects/WebPlan/</homepage>
    <duration from="2010" to="2013"/>
    <people>
      <member_ref ref="pierre"/>
    </people>
    <consortium>
      <institution_ref ref="oxford"/>
    </consortium>
    <abstract><h:p>The functioning of entities as diverse as enterprises and
        government agencies depends on obtaining high-quality data.
        Increasingly these entities depend on external sources for their
        operational data: critical data is obtained dynamically via web
        services, is extracted from web pages, or is purchased from third
        parties. These sources can differ radically in their completeness,
        accuracy, and availability. It is not possible for applications to
        index and explore data from each source in advance of querying:
        there are too many sources, they are too costly to access, and the
        data in them may be refreshed constantly.</h:p>
      <h:p>
        How should data acquisition proceed in such situations?
        </h:p><h:p>
        In this project we will develop algorithms for answering queries in
        the presence of large numbers of web-based data sources, sources
        that may overlap substantially in their datasets but have different
        access restrictions and costs. Our approach will make use of schema
        information about the data an application is querying: data format,
        integrity constraints, and any prior knowledge of costs that may be
        available. The core of the project will be algorithms for answering
        a query by interactively exploring the sources, dynamically pruning
        out irrelevant or exhausted sources in the process.</h:p>
    </abstract>
    <funding>WebPlan is sponsored by the UK Engineering and Physical
      Sciences Research Council (EPSRC).</funding>
  </project>
  <project id="panic">
    <title>Pro-Activity of Audience and Digitization of Cultural Industries</title>
    <acronym>PANIC</acronym>
    <homepage>http://www.infres.enst.fr/wp/projetpanic</homepage>
    <duration from="2010" to="2012"/>
    <people>
      <member_ref ref="pierre"/>
    </people>
    <consortium>
      <institution_ref ref="cnam"/>
      <institution_ref ref="paris13"/>
      <institution_ref ref="orange-labs"/>
    </consortium>
    <abstract>
     This research project, which involves economists and social scientists,
     deals with the evolution of the audience of cultural media (books, press, games, music, etc.) with
     the advent of digitization and the Internet. In this project, we are involved in data cleaning and data
     enrichment tasks.
     </abstract>
    <funding>The PANIC project (2009–2011) is sponsored by the French
      national research agency ANR (Agence Nationale de la Recherche),
      within the programme Content and Interactions
      (CONTINT).</funding>
  </project>
  <project>
    <title>Answering relational/XML queries using views</title>
    <acronym>REWRITE</acronym>
    <duration from="2008" to="2011"/>
    <people>
      <internal_ref ref="bogdan"/>
    </people>
    <consortium>
      <institution_ref ref="ucsd"/>
    </consortium>
    <abstract>We study in this project the problem of querying data sources that accept only a
limited set of queries, such as sources accessible by Web services
which can implement very large (potentially infinite) families of
queries. We first revisit a classical setting in which the
application queries are conjunctive queries and the source accepts
families of (possibly parameterized) conjunctive
queries specified as the expansions of a
(potentially recursive) Datalog program with parameters, under the assumption that sources
 satisfy integrity constraints. 
 
We then consider XML queries and views. The standard approach to optimizing XPath queries by rewriting
them using views consists in navigating inside a view’s output, which allows
only one view to be used in the rewritten query. Algorithms for richer classes of XPath rewritings,
using intersection
or joins on node identifiers, have been proposed, but they either lack
completeness guarantees, or require additional information about the
data. We study restrictions under which an XPath query can
be rewritten in polynomial time using an intersection of views, and effective algorithms that work for
any document or type of identifier. Moreover, we are interested in the complexity
of the related problem of deciding whether an XPath query with intersection can
be equivalently rewritten as one without intersection or union.

Starting from our novel techniques for XML query answering using multiple views, we then study 
expressibility and support when a (potentially infinite) set
of views is specified using the QSS (Query Set Specification) formalism.
</abstract>
  </project>
  <project>
    <title>From Collect-All Archives to Community Memories –
Leveraging the Wisdom of the Crowds for Intelligent Preservation</title>
    <acronym>Arcomem</acronym>
    <logo href="logos/arcomem.png"/>
    <homepage>http://www.arcomem.eu/</homepage>
    <duration from="2010" to="2012"/>
    <people>
      <internal_ref ref="bogdan"/>
      <internal_ref ref="marilena"/>
      <internal_ref ref="pierre"/>
      <internal_ref ref="silviu"/>
      <internal_ref ref="talel"/>
      <internal_ref ref="georges"/>
      <internal_ref ref="faheem"/>
    </people>
    <consortium>
      <institution_ref ref="usfd"/>
      <institution_ref ref="luh"/>
      <institution_ref ref="yahoo"/>
      <institution_ref ref="imf"/>
      <institution_ref ref="southampton"/>
      <institution_ref ref="atc"/>
      <institution_ref ref="athena"/>
      <institution_ref ref="dw"/>
      <institution_ref ref="swr"/>
      <institution_ref ref="hep"/>
      <institution_ref ref="aup"/>
    </consortium>
    <abstract><h:p>Arcomem is about memory institutions like archives, museums, and libraries in the age of the
Social Web. Memory institutions are more important now than ever: as we face greater
economic and environmental challenges we need our understanding of the past to help us
navigate to a sustainable future. This is a core function of democracies, but this function faces
stiff new challenges in the face of the Social Web, and of the radical changes in information
creation, communication and citizen involvement that currently characterise our information
society (e.g., there are now more social network hits than Google searches). Social media are
becoming more and more pervasive in all areas of life. In the UK, for example, it is now not
unknown for a government minister to answer a parliamentary question using Twitter, and this
material is both ephemeral and highly contextualised, making it increasingly difficult for a
political archivist to decide what to preserve.</h:p>
<h:p>This new world challenges the relevance and power of our memory institutions. To answer these
challenges, Arcomem’s aim is to:</h:p>
<h:ul><h:li>help transform archives into collective memories that are more tightly integrated with
their community of users</h:li>
<h:li>exploit the Social Web and the wisdom of crowds to make Web archiving a more selective
and meaning-based process</h:li></h:ul>
<h:p>To do this we will provide innovative tools for archivists to help exploit the new media and
make our organisational memories richer and more relevant. We will do this in three ways:</h:p>
<h:ol>
<h:li> we will show how social media can help archivists select material for inclusion,
providing content appraisal via the social web</h:li>
<h:li> we will show how social media mining can enrich archives, moving towards
structured preservation around semantic categories</h:li>
<h:li> we will look at social, community and user-based archive creation methods</h:li>
</h:ol>
</abstract>
    <funding>The Arcomem project (2011–2013) is sponsored by the European
      Union, within the 7th framework programme on digital libraries and
      digital preservation.</funding>
</project>

<project>
    <title>Inferring trust networks from user interactions</title>
    <acronym>WikiSigned</acronym>
    <homepage>http://perso.telecom-paristech.fr/~maniu/wikisigned/</homepage>
    <people>
      <internal_ref ref="bogdan"/>
      <internal_ref ref="silviu"/>
      <internal_ref ref="talel"/>
    </people>
    <abstract><h:p>Large online communities that contribute and share content nowadays account for a significant, high-quality portion of the data on the Web. Examples of collaborative applications oriented towards building
    repositories of quality user-generated content include online encyclopedias (Wikipedia, Knol), photo sharing sites (Flickr) or rating sites (Epinions). An important trend in such platforms aims at exploiting user relationships, links
    between users (e.g., social links), in order to improve core functionalities in the system. For instance, search, recommendation or access control can benefit from socially-driven approaches. This is especially the case when links can 
    be viewed as being signed, indicating a positive or negative attitude; possible meanings for positive links could be trust, friendship or similarity, while for negative links they could stand for distrust, opposition or antagonism. In 
    settings where explicit relationships do not exist, are sparse or are inadequate indicators of the attitude towards fellow members of the community, it thus becomes important to uncover implicit user inter-connections, positive or
    negative links, from relevant user activities and their interactions.</h:p>
<h:p>This project aims at:</h:p>
<h:ul>
<h:li>designing methods to automatically extract signed networks from the social Web,</h:li>
<h:li>establishing sound metrics and methods for validating the inferred signed networks,</h:li>
<h:li>addressing the computational challenges inherent to the large-scale extraction and creation of the networks,</h:li>
<h:li>designing applications (e.g., recommendation systems, personalised search) that fully exploit the signed social links.</h:li>
</h:ul>
</abstract>
</project>

  <institution id="ucsd">
    <name>University of California San Diego</name>
    <homepage>http://db.ucsd.edu/People/alin/</homepage>
  </institution>
  
  <institution id="inria_saclay">
    <name>INRIA Saclay – Île-de-France</name>
    <homepage>http://www.inria.fr/saclay/</homepage>
  </institution>
  <institution id="lig">
    <name>LIG</name>
    <homepage>http://www.liglab.fr/</homepage>
  </institution>
  <institution id="lirmm">
    <name>LIRMM</name>
    <homepage>http://www.lirmm.fr/</homepage>
  </institution>
  <institution id="inria_rennes">
    <name>INRIA Rennes – Bretagne Atlantique</name>
    <homepage>http://www.inria.fr/rennes/</homepage>
  </institution>
  <institution id="oxford">
    <name>Oxford University Computing Laboratory</name>
    <homepage>http://web.comlab.ox.ac.uk/</homepage>
   </institution>
   <institution id="cnam">
    <name>Conservatoire national des Arts et Métiers</name>
      <homepage>http://www.cnam.fr/</homepage>
   </institution>
   <institution id="paris13">
    <name>Université Paris-Nord 13</name>
    <homepage>http://www.univ-paris13.fr/</homepage>
   </institution>
   <institution id="orange-labs">
    <name>Orange Labs</name>
    <homepage>http://www.orange.com/en_EN/innovation/</homepage>
   </institution>
   <institution id="usfd">
    <name>University of Sheffield</name>
    <homepage>http://www.sheffield.ac.uk/</homepage>
   </institution>
   <institution id="luh">
     <name>Leibniz Universität Hannover</name>
     <homepage>http://www.l3s.de/</homepage>
   </institution>
   <institution id="yahoo">
     <name>Yahoo! Iberia</name>
     <homepage>http://labs.yahoo.com/Yahoo_Labs_Barcelona</homepage>
   </institution>
   <institution id="imf">
     <name>Internet Memory Foundation</name>
     <homepage>http://internetmemory.org/</homepage>
   </institution>
   <institution id="southampton">
     <name>University of Southampton</name>
     <homepage>http://www.soton.ac.uk/</homepage>
   </institution>
   <institution id="atc">
     <name>Athens Technology Center</name>
     <homepage>http://www.atc.gr/</homepage>
   </institution>
   <institution id="athena">
     <name>Athena Research and Innovation Center in Information
       Communication and Knowledge Technologies</name>
     <homepage>http://www.imis.athena-innovation.gr/</homepage>
   </institution>
   <institution id="dw">
     <name>Deutsche Welle</name>
     <homepage>http://www.dw-world.de/</homepage>
   </institution>
   <institution id="swr">
     <name>Südwestrundfunk</name>
     <homepage>http://www.swr.de/</homepage>
   </institution>
   <institution id="hep">
     <name>Hellenic Parliament</name>
     <homepage>http://www.hellenicparliament.gr/</homepage>
   </institution>
   <institution id="aup">
     <name>Austrian Parliament</name>
     <homepage>http://www.parliament.gv.at/</homepage>
   </institution>
</projects>

