Aggregate Values Over a Dataset

In this section, we learn how to get Grakn to calculate the count, sum, max, min, mean and median values of a specific set of data in the knowledge graph. To perform aggregation in Grakn, we first write a match clause to describe the set of data, follow it with get to retrieve a distinct set of answers based on the specified variables, and lastly add an aggregate function to apply to the variable of interest.

To try the following examples with one of the Grakn clients, follow the Clients Guide.

Count

We use the count function to get the number of answers matched for the specified variables.

[tab:Graql]
```graql
match
  $sce isa school-course-enrollment, has score $sco;
  $sco > 7.0;
get; count;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Aggregate query = Graql.match(
    var("sce").isa("school-course-enrollment").has("score", var("sco")),
    var("sco").gt(7.0)
).get().count();
```
[tab:end]
[Note] When more than one variable follows the `get` keyword, the `count` function is applied to the unique set of the retrieved variables. This is also the case when no variable follows `get`, which means that all matched variables are included.
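
To make the note above concrete, here is a minimal plain-Java sketch of counting over a unique set of answers. It does not use the Grakn client: the class `CountSketch`, the method `countDistinct` and the sample values are hypothetical stand-ins, and each answer is modelled simply as the list of values bound to the retrieved variables.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: models "count over the unique set of retrieved variables".
final class CountSketch {

    // Each inner list holds the values bound to the retrieved variables of one answer.
    static long countDistinct(List<List<String>> answers) {
        // A set keeps one copy of each answer, so duplicates are counted once.
        Set<List<String>> unique = new LinkedHashSet<>(answers);
        return unique.size();
    }

    public static void main(String[] args) {
        List<List<String>> answers = Arrays.asList(
            Arrays.asList("enrollment-1", "8.5"),
            Arrays.asList("enrollment-2", "9.0"),
            Arrays.asList("enrollment-1", "8.5") // same bindings as the first answer
        );
        System.out.println(countDistinct(answers)); // prints 2
    }
}
```

The third answer repeats the bindings of the first one, so it does not contribute to the count.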

Sum

We use the sum function to get the sum of the values of the specified matched variable, which is of type long or double.

[tab:Graql]
```graql
match
  $org isa organisation, has name $orn;
  $orn "Medicely";
  ($org) isa employment, has salary $sal;
get $sal; sum $sal;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Aggregate query = Graql.match(
    var("org").isa("organisation").has("name", var("orn")),
    var("orn").val("Medicely"),
    var().rel("org").isa("employment").has("salary", var("sal"))
).get("sal").sum("sal");
```
[tab:end]

Maximum

We use the max function to get the maximum among the values of the specified matched variable, which is of type long or double.

[tab:Graql]
```graql
match
  $sch isa school, has ranking $ran;
get $ran; max $ran;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Aggregate query = Graql.match(
    var("sch").isa("school").has("ranking", var("ran"))
).get("ran").max("ran");
```
[tab:end]

Minimum

We use the min function to get the minimum among the values of the specified matched variable, which is of type long or double.

[tab:Graql]
```graql
match
  ($per) isa marriage;
  ($per) isa employment, has salary $sal;
get $sal; min $sal;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Aggregate query = Graql.match(
    var().rel(var("per")).isa("marriage"),
    var().rel(var("per")).isa("employment").has("salary", var("sal"))
).get("sal").min("sal");
```
[tab:end]

Mean

We use the mean function to get the average of the values of the specified matched variable, which is of type long or double.

[tab:Graql]
```graql
match
  $emp isa employment, has salary $sal;
get $sal; mean $sal;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Aggregate query = Graql.match(
    var("emp").isa("employment").has("salary", var("sal"))
).get("sal").mean("sal");
```
[tab:end]
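
The sum, max, min and mean functions each reduce the values of one numeric variable to a single number. As a rough illustration of what these reductions compute, the plain-Java sketch below applies them to a hand-written set of salary values. It does not call Grakn: the class `NumericAggregateSketch`, the method `summarise` and the sample salaries are hypothetical.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

// Illustrative only: computes sum, max, min and mean over in-memory values.
final class NumericAggregateSketch {

    // Gathers the four statistics in a single pass over the values.
    static DoubleSummaryStatistics summarise(double... values) {
        return DoubleStream.of(values).summaryStatistics();
    }

    public static void main(String[] args) {
        // Hypothetical salary values standing in for the matched $sal answers.
        DoubleSummaryStatistics stats = summarise(30000, 45000, 60000);

        System.out.println(stats.getSum());     // prints 135000.0
        System.out.println(stats.getMax());     // prints 60000.0
        System.out.println(stats.getMin());     // prints 30000.0
        System.out.println(stats.getAverage()); // prints 45000.0
    }
}
```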

Median

We use the median function to get the median among the values of the specified matched variable, which is of type long or double.

[tab:Graql]
```graql
match
  $org isa organisation, has name $orn;
  $orn == "Facelook";
  (employer: $org, employee: $per) isa employment;
  ($per) isa school-course-enrollment, has score $sco;
get $sco; median $sco;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Aggregate query = Graql.match(
    var("org").isa("organisation").has("name", var("orn")),
    var("orn").val("Facelook"),
    var().rel("employer", var("org")).rel("employee", var("per")).isa("employment"),
    var().rel(var("per")).isa("school-course-enrollment").has("score", var("sco"))
).get("sco").median("sco");
```
[tab:end]
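
For readers unfamiliar with the median, the plain-Java sketch below shows the conventional definition: the middle value of the sorted data, or the average of the two middle values when the number of values is even. Treating the even case this way is an assumption of the sketch rather than a statement about Grakn's implementation. The class `MedianSketch` and the sample scores are hypothetical.

```java
import java.util.Arrays;

// Illustrative only: conventional median over in-memory values.
final class MedianSketch {

    static double median(double... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("median of an empty set is undefined");
        }
        // Sort a copy so the caller's array is left untouched.
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        // Odd count: the middle value. Even count: average of the two middle values (assumed).
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(7.5, 9.0, 8.0));      // prints 8.0
        System.out.println(median(6.0, 7.0, 8.0, 9.0)); // prints 7.5
    }
}
```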

Grouping Answers

We use the group function, optionally followed by another aggregate function, to group the answers by the specified matched variable.

[tab:Graql]
```graql
match
  $per isa person;
  $scc isa school-course, has title $tit;
  (student: $per, enrolled-course: $scc) isa school-course-enrollment;
get; group $tit;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Group query = Graql.match(
    var("per").isa("person"),
    var("scc").isa("school-course").has("title", var("tit")),
    var().rel("student", var("per")).rel("enrolled-course", var("scc")).isa("school-course-enrollment")
).get().group("tit");
```
[tab:end]

This query returns all instances of person grouped by the title of their school-course.

[tab:Graql]
```graql
match
  $per isa person;
  $scc isa school-course, has title $tit;
  (student: $per, enrolled-course: $scc) isa school-course-enrollment;
get; group $tit; count;
```
[tab:end]
[tab:Java]
```java
GraqlGet.Group.Aggregate query = Graql.match(
    var("per").isa("person"),
    var("scc").isa("school-course").has("title", var("tit")),
    var().rel("student", var("per")).rel("enrolled-course", var("scc")).isa("school-course-enrollment")
).get().group("tit").count();
```
[tab:end]

This query returns the total count of persons grouped by the title of their school-course.
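
The difference between grouping alone and grouping followed by count can be sketched in plain Java over in-memory data. This is only an analogy for the shape of the two results, not the Grakn client API: the class `GroupSketch`, its nested `Enrollment` type and the sample names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: models grouping answers by title, with and without a count.
final class GroupSketch {

    // Hypothetical stand-in for one answer: a person and the title of a course they take.
    static final class Enrollment {
        final String person;
        final String title;

        Enrollment(String person, String title) {
            this.person = person;
            this.title = title;
        }
    }

    // Analogous to "group $tit;": collect the persons under each course title.
    static Map<String, List<String>> groupByTitle(List<Enrollment> answers) {
        return answers.stream().collect(Collectors.groupingBy(
            e -> e.title,
            TreeMap::new, // sorted keys give a stable print order
            Collectors.mapping(e -> e.person, Collectors.toList())));
    }

    // Analogous to "group $tit; count;": count the answers under each course title.
    static Map<String, Long> countByTitle(List<Enrollment> answers) {
        return answers.stream().collect(Collectors.groupingBy(
            e -> e.title,
            TreeMap::new,
            Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Enrollment> answers = Arrays.asList(
            new Enrollment("Alice", "Biology"),
            new Enrollment("Bob", "Biology"),
            new Enrollment("Alice", "Physics"));

        System.out.println(groupByTitle(answers)); // prints {Biology=[Alice, Bob], Physics=[Alice]}
        System.out.println(countByTitle(answers)); // prints {Biology=2, Physics=1}
    }
}
```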

Clients Guide

[Note] **For those developing with Client [Java](../client-api/java)**: Executing an `aggregate` query is as simple as calling the [`execute()`](../client-api/java#eagerly-execute-a-graql-query) method on a transaction and passing the query object to it.
[Note] **For those developing with Client [Node.js](../client-api/nodejs)**: Executing an `aggregate` query is as simple as passing the Graql (string) query to the [`query()`](../client-api/nodejs#lazily-execute-a-graql-query) function available on the [`transaction`](../client-api/nodejs#transaction) object.
[Note] **For those developing with Client [Python](../client-api/python)**: Executing an `aggregate` query is as simple as passing the Graql (string) query to the [`query()`](../client-api/python#lazily-execute-a-graql-query) method available on the [`transaction`](../client-api/python#transaction) object.

Summary

We use an aggregate query to calculate a value over a variable defined in the preceding match clause, which describes a set of data in the knowledge graph.

Next, we learn how to compute values over a large set of data in a knowledge graph.