DBWeb — Internships, PhD or Post-Doc Positions, Student Projects

These topics are given as indications only. If you are interested in any of them, or in similar topics, please contact the indicated researcher.

PhD Position on Machine Learning for the Web Graph & Social networks

Duration: 3 years

Level: PhD

Proposed by: Michalis Vazirgiannis

Applications are invited for a Ph.D. position, starting at the beginning of 2012, in the context of the DIGITEO project LEVETONE, which focuses on advanced Web Mining research. The project is a joint effort among top French universities within the ParisTech alliance, with the support of local industrial partners. The project is run and supervised by Prof. M. Vazirgiannis.

The research area is “Machine Learning for the Web Graph & Social networks”. More specifically, the research focuses on the following axes:

  • Advanced methods for graph mining in social networks
  • Real time personalization for mobile devices

Requirements

Prospective applicants should have

  • a B.Sc. and a Master's degree in one of the following areas: Mathematics, Physics, Computer Science/Engineering
  • experience in mathematical programming and relevant tools
  • analytical skills and creative thinking with a hard-working attitude
  • a very good command of English (demonstrated by relevant internationally recognized tests)

Funding: Full funding for 3 years is available.

Applications

Interested graduate students should send by email

  • a cover letter including a brief presentation of their academic record, the motivation and the skills of the candidate
  • a full CV
  • a list of at least 2 academic/industrial references (names and contact information only, not the actual letters)

The above should be e-mailed in a compressed file named after your surname (i.e., <surname>.rar) to Prof. M. Vazirgiannis.

Location

This position is joint between LIX/École polytechnique and the INFRES department of Télécom ParisTech, members of the ParisTech alliance. See further details at Prof. M. Vazirgiannis's Web page.

Post-doctoral position on Machine Learning for the Web Graph & Social networks

Duration: 1 year

Level: PostDoc

Proposed by: Michalis Vazirgiannis

Applications are invited for a post-doctoral position, available in the period 2011-2013, in the context of the DIGITEO project LEVETONE, which focuses on advanced Web Mining research. The project is a joint effort among top French universities within the ParisTech alliance, with the support of local industrial partners. The project is run and supervised by Prof. M. Vazirgiannis.

The research area is “Machine Learning for the Web Graph & Social networks”. More specifically, the research focuses on the following axes:

  • Advanced methods for graph mining in social networks
  • Real time personalization for mobile devices

Requirements

Prospective applicants should have

  • a recent Ph.D. degree in one of the following areas: Computer Science/Engineering, Computational Mathematics, Physics
  • experience in data management, mathematical programming and relevant tools
  • analytical skills and creative thinking with a hard-working attitude
  • a sound publication record

Funding: Full funding for up to 12 months is available.

Applications

Interested candidates should send by email

  • a cover letter including a brief presentation of their academic record, the motivation and the skills of the candidate
  • a full CV
  • a list of at least 2 academic/industrial references (names and contact information only, not the actual letters)

The above should be e-mailed in a compressed file named after your surname (i.e., <surname>.rar) to Prof. M. Vazirgiannis.

Location

This position is joint between LIX/École polytechnique and the INFRES department of Télécom ParisTech, members of the ParisTech alliance. See further details at Prof. M. Vazirgiannis's Web page.

Social web exploration

Duration: 2–6 months

Level: MSc or Engineering Student

Proposed by: Georges Gouriten and Pierre Senellart

The intern will be part of the ARCOMEM project, a 3-year European project bringing together twelve private and public partners from seven different countries in order to build an intelligent social web archiving tool. More specifically, the intern will join our team working on new approaches to web exploration. The main project the intern will be involved in concerns the exploration (crawling) of the social web.

On social networks, standard HTML crawling is not always possible, and it is sometimes necessary or more convenient to get the data through service-specific calls (APIs). However, these APIs usually impose restrictive limits on the number of requests per hour. With a view to archiving social data, we want to address this challenging research problem: how can we maximize the amount of data that can be accessed under specific API constraints? This will be the main research project the intern will work on.

Other options related to ARCOMEM are open too. We are developing a framework to simplify interactions with multiple APIs, and we face interesting technical issues. There are also questions on the use of social data in the integrated project. Web crawling usually consists of recursively exploring all the links extracted from web pages; for ARCOMEM, we are developing intelligent modules to prioritize web pages that are deemed interesting relative to their content and context, in particular the social context.
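To make the idea of prioritized crawling concrete, here is a minimal sketch of a best-first exploration over an in-memory link graph. It is an illustration only, not the ARCOMEM crawler: the function name `prioritized_crawl`, the toy graph `LINKS`, and the relevance scores in `RELEVANCE` are all hypothetical stand-ins for real page fetching and for the content and social-context analysis mentioned above.

```python
import heapq


def prioritized_crawl(seed, links, score, budget):
    """Visit up to `budget` pages, always expanding the highest-scoring known page."""
    # heapq is a min-heap, so scores are negated to pop the best page first.
    frontier = [(-score(seed), seed)]
    seen = {seed}
    visited = []
    while frontier and len(visited) < budget:
        _neg_score, page = heapq.heappop(frontier)
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-score(target), target))
    return visited


# Hypothetical toy link graph and relevance scores (assumptions for illustration).
LINKS = {
    "home": ["news", "about"],
    "news": ["event", "ads"],
    "about": [],
    "event": [],
    "ads": [],
}
RELEVANCE = {"home": 0.5, "news": 0.9, "about": 0.2, "event": 0.8, "ads": 0.1}

order = prioritized_crawl("home", LINKS, RELEVANCE.get, budget=4)
print(order)
```

With a budget of four pages, the low-scoring "ads" page is never visited, whereas a plain breadth-first crawl would treat all links alike.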

We hope this gives you some idea of the research and development challenges. The work on the crawler has already resulted in many exciting innovations, and the many dimensions of the ARCOMEM project open up a variety of research opportunities.

The ideal candidate

We are very open-minded about applications; our only firm requirement is some previous programming experience. The main qualities we are looking for are: interest in computer science, motivation, innovative thinking, openness to collaboration, and a proactive mindset. Candidates are expected to be proficient in English.

What makes probabilistic data efficiently queryable?

Duration: 6 months

Level: MSc

Proposed by: Pierre Senellart

Probabilistic databases are compact representations of probability distributions over regular databases. A number of models have been proposed for probabilistic data, both in the relational and the XML settings. Evaluating a Boolean query over a probabilistic database amounts to computing the probability that the query holds under this probability distribution. One crucial question is whether query evaluation remains tractable on probabilistic databases. A number of research works have looked at characteristics of queries that may make them tractable: for instance, conjunctive queries without self-joins are tractable over tuple-independent databases if and only if they are hierarchical, while tree-pattern queries with a single join are tractable if and only if they are equivalent to a join-free query. The objective of this internship is to approach the problem from the other side: identifying classes of data for which queries are tractable. One direction is to look at bounding the treewidth of the data; another is to try to find join patterns that make querying easy.
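As a small illustration of what query evaluation means here, the sketch below computes the probability of the Boolean query "there exist x, y such that R(x) and S(x, y)" over a tuple-independent database in two ways: by enumerating all possible worlds (exponential in the number of tuples), and by a closed-form computation that exploits the hierarchical structure of this particular query (polynomial time). The relations `R` and `S`, their probabilities, and the function names are assumptions made up for the example.

```python
from itertools import product

# Hypothetical tuple-independent relations: each tuple maps to its marginal
# probability, and all tuples are present or absent independently.
R = {("a",): 0.5, ("b",): 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.5, ("b", 1): 0.9}


def query_holds(r_world, s_world):
    """Boolean query q: there exist x, y such that R(x) and S(x, y)."""
    return any((x,) in r_world for (x, _y) in s_world)


def probability_by_enumeration(R, S):
    """Sum the probabilities of all possible worlds in which q holds."""
    facts = [("R", t, p) for t, p in R.items()] + [("S", t, p) for t, p in S.items()]
    total = 0.0
    for choices in product([False, True], repeat=len(facts)):
        weight = 1.0
        r_world, s_world = set(), set()
        for (rel, t, p), present in zip(facts, choices):
            weight *= p if present else 1.0 - p
            if present:
                (r_world if rel == "R" else s_world).add(t)
        if query_holds(r_world, s_world):
            total += weight
    return total


def probability_by_safe_plan(R, S):
    """Use independence across values of x, then across y for each fixed x."""
    prob_no_match = 1.0
    for (x,), p_r in R.items():
        prob_no_s = 1.0  # probability that no S(x, y) tuple is present
        for (x2, _y), p_s in S.items():
            if x2 == x:
                prob_no_s *= 1.0 - p_s
        prob_no_match *= 1.0 - p_r * (1.0 - prob_no_s)
    return 1.0 - prob_no_match


p_enum = probability_by_enumeration(R, S)
p_safe = probability_by_safe_plan(R, S)
print(round(p_enum, 6), round(p_safe, 6))
```

Both computations agree on this instance; the second one avoids the exponential enumeration, which is the kind of tractability the paragraph above refers to.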

For any question regarding this website, please contact dbweb@telecom-paristech.fr.