
Relational Random Walks

Alexander L. Hayes edited this page Oct 25, 2017 · 5 revisions

Lifted Relational Random Walks

by: Navdeep Kaur

Table of Contents:

  1. Introduction
  2. Parameters

Introduction

The schema of a relational dataset can often be represented as a lifted graph in which the nodes represent entity types and the edges represent the relations that connect two entity types. A random walk on such a graph can uncover interesting structure in the relational schema. For example, the following random walk can be interpreted as "a person taught a course that has a courseLevel of levelid":

Consider the random walk: personid->taught->courseid->courseLevel->levelid

The above random walk can be converted into clausal form as:

taught(personid,courseid) ∧ courseLevel(courseid,levelid)
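The conversion above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used inside RDN-Boost; the function name `walk_to_clause` is an assumption made for this example, and it assumes the walk alternates entity types and relations, separated by `->`.

```python
def walk_to_clause(walk: str) -> str:
    """Turn 'type->rel->type->rel->type' into a conjunction of binary literals."""
    tokens = [t.strip() for t in walk.split("->")]
    # A well-formed walk alternates entity types and relations and both starts
    # and ends on an entity type, so it has an odd number of tokens (at least 3).
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise ValueError(f"malformed walk: {walk!r}")
    literals = []
    for i in range(1, len(tokens), 2):
        source, relation, target = tokens[i - 1], tokens[i], tokens[i + 1]
        literals.append(f"{relation}({source},{target})")
    return " ∧ ".join(literals)


print(walk_to_clause("personid->taught->courseid->courseLevel->levelid"))
# prints: taught(personid,courseid) ∧ courseLevel(courseid,levelid)
```

Each relation in the walk becomes one literal whose arguments are the entity types on either side of it, so consecutive literals share a variable.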

Lifted Relational Random Walks has been integrated into RDN-Boost to obtain such interesting structures in relational domains.

Parameters:

Random Walks can be obtained by running the following command in RDN-Boost:

java -cp edu.iu.cs.RelationalRandomWalks.RunRelationalRandomWalks -rw -train "./facts.txt" -startentity "personid" -endentity "personid" -maxRWlen 6

As shown above, the following flags need to be set:

-rw: This flag needs to be set to perform relational random walks.

-startentity: This flag sets the entity type at which every random walk must start, for instance personid in the above example.

-endentity: This flag sets the entity type at which every random walk must end. The command above uses personid; the walk in the Introduction ends at levelid.

-maxRWlen: This flag sets the maximum length (number of relations) of any random walk.

-train: This flag sets the path to the schema file.
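To make the roles of the start entity, end entity, and maximum length concrete, here is a minimal sketch that enumerates walks on a small lifted graph. It is not the RDN-Boost implementation: the `EDGES` list and the `enumerate_walks` function are hypothetical, and the sketch ignores the per-relation flags that can be set in the schema file. Inverse relations are written with a leading underscore.

```python
from collections import defaultdict

# Hypothetical lifted graph: (relation, source entity type, target entity type).
EDGES = [
    ("taught", "personid", "courseid"),
    ("courseLevel", "courseid", "levelid"),
]


def enumerate_walks(edges, start, end, max_len, allow_inverse=True):
    """Return every walk from `start` to `end` that uses 1..max_len relations."""
    outgoing = defaultdict(list)
    for rel, src, dst in edges:
        outgoing[src].append((rel, dst))
        if allow_inverse:
            # The inverse relation runs from the target type back to the source.
            outgoing[dst].append(("_" + rel, src))

    walks = []

    def extend(node, path):
        if path and node == end:
            walks.append(list(path))  # record a copy; keep exploring longer walks
        if len(path) == max_len:
            return
        for rel, nxt in outgoing[node]:
            path.append(rel)
            extend(nxt, path)
            path.pop()

    extend(start, [])
    return walks


print(enumerate_walks(EDGES, "personid", "levelid", 2))
# prints: [['taught', 'courseLevel']]
```

The number of walks grows quickly with the maximum length once inverse relations are allowed, which is one reason to bound it.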

Input File:

The input file contains the schema of the relational dataset. An example schema is shown below:

courseLevel(courseid,levelid)|NoBF

student(personid,studentype)|NoBF

professor(personid,professortype)|NoBF

inPhase(personid,phaseid)|NoBF

yearsInProgram(personid,yearid)|NoBF

hasPosition(personid,positiontype)|NoBF

B_taughtBy(courseid,personid)|NoTwin|NoBB
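A schema line of this shape can be split into a relation name, its argument types, and its flags. The parser below is a sketch written for this page, not the one used by the tool; the names `LINE_RE` and `parse_schema_line` are assumptions, and it assumes one relation per line with flags separated by vertical bars.

```python
import re

# relation(arg1,arg2,...) optionally followed by |Flag|Flag...
LINE_RE = re.compile(r"^\s*(\w+)\(([^)]*)\)\s*((?:\|\s*\w+\s*)*)$")


def parse_schema_line(line):
    """Return (relation name, list of argument types, list of flags)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a schema line: {line!r}")
    name, args, flags = match.groups()
    arg_types = [a.strip() for a in args.split(",")]
    # Splitting '|NoTwin|NoBB' on '|' yields a leading empty string; drop it.
    flag_list = [f.strip() for f in flags.split("|") if f.strip()]
    return name, arg_types, flag_list


print(parse_schema_line("courseLevel(courseid,levelid)|NoBF"))
# prints: ('courseLevel', ['courseid', 'levelid'], ['NoBF'])
print(parse_schema_line("B_taughtBy(courseid,personid)|NoTwin|NoBB"))
# prints: ('B_taughtBy', ['courseid', 'personid'], ['NoTwin', 'NoBB'])
```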

Setting Flags in Input File:

As the examples above show, flags can be set in the schema file after a vertical bar (|) following each relation. For more information on the importance of these flags, please refer to [1]. This code supports the following flags:

NoTwin: This code allows an inverse relation for every relation present in the schema file, represented by putting an underscore (_) in front of the relation name. For example, the inverse of courseLevel(courseid,levelid) is represented as _courseLevel(levelid,courseid), and courseLevel and _courseLevel are two distinct relations. Setting NoTwin disallows the inverse of the relation from appearing in random walks.

NoBB: Setting this flag prevents an inverse relation from immediately following itself in a random walk.

NoFF: Setting this flag prevents a ‘non-inverse’ relation from immediately following itself in a random walk.

NoFB: Setting this flag prevents an inverse relation from immediately following its ‘non-inverse’ counterpart in a random walk.

NoBF: Setting this flag prevents a ‘non-inverse’ relation from immediately following its inverse counterpart in a random walk.

Caution: these flags are case-sensitive, so type them exactly as shown.
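The flag descriptions above can be read as a filter on consecutive steps of a walk. The following sketch encodes that reading; it is an interpretation of the text on this page rather than the tool's actual logic, and the function `walk_allowed` and the `flags` dictionary are hypothetical. It assumes that a leading underscore marks an inverse step and that no relation name in the schema itself begins with an underscore.

```python
def base_name(step):
    """Relation name without the inverse marker."""
    return step[1:] if step.startswith("_") else step


def direction(step):
    """'B' for an inverse (backward) step, 'F' for a non-inverse (forward) step."""
    return "B" if step.startswith("_") else "F"


def walk_allowed(walk, flags):
    """walk: list of relation steps; flags: dict mapping relation name -> set of flags."""
    for i, step in enumerate(walk):
        rel_flags = flags.get(base_name(step), set())
        # NoTwin: the inverse of this relation may not appear at all.
        if direction(step) == "B" and "NoTwin" in rel_flags:
            return False
        # NoFF / NoFB / NoBF / NoBB: constrain two consecutive steps that use
        # the same relation; the flag letters are the directions of the first
        # and second step.
        if i > 0 and base_name(walk[i - 1]) == base_name(step):
            pair = direction(walk[i - 1]) + direction(step)
            if "No" + pair in rel_flags:
                return False
    return True


flags = {"courseLevel": {"NoBF"}, "B_taughtBy": {"NoTwin", "NoBB"}}
print(walk_allowed(["courseLevel", "_courseLevel"], flags))  # prints: True
print(walk_allowed(["_courseLevel", "courseLevel"], flags))  # prints: False
print(walk_allowed(["_B_taughtBy"], flags))                  # prints: False
```

The sketch checks only the flags; it does not verify that the entity types of consecutive relations line up.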

Output File:

The output is stored in the file ‘RWRPredicates.txt’, in the same folder as the input file.

References:

  1. Ni Lao and William W. Cohen, “Relational Retrieval Using a Combination of Path-Constrained Random Walks”, Machine Learning, 81(1):53–67, 2010.