Additional Core/Peripheral Classification Methods #276
base: dev
Base implementation for new classification metrics. Documentation and testing still missing. Signed-off-by: Leo Sendelbach <[email protected]>
Tests use an already existing network, so these test cases are quite small. Additional research into potential rounding errors may be required. Signed-off-by: Leo Sendelbach <[email protected]>
Add default documentation, same as for already existing classification methods Signed-off-by: Leo Sendelbach <[email protected]>
Add new entry under 'unversioned' Signed-off-by: Leo Sendelbach <[email protected]>
I had a quick look at the implementation (but not yet at the tests).
Please find my initial comments below.
@@ -96,7 +101,7 @@ CLASSIFICATION.TYPE.TO.CATEGORY = list(
 #' Network-based options/metrics (parameter \code{network} has to be specified):
 #' - "network.degree"
 #' - "network.eigen"
-#' - "network.hierarchy"
+#' - "network.hierarchy" ###TODO check all documentation
Please don't forget about this TODO 😉
## since core developers are expected to have a lower eccentricity,
## we need to invert all non-zero values
indices = which(eccentricity.vec > 0)
From the description it is not clear what happens for zero values...
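To make the question concrete, here is a minimal sketch of one possible inversion. It assumes that "invert" means taking the reciprocal and that zero values (e.g., isolated vertices) are deliberately left at zero; the PR may do something different. The sample values and the name `inverted.vec` are made up for illustration.

```r
## Hypothetical eccentricity values; D stands for an isolated vertex (eccentricity 0)
eccentricity.vec = c(A = 2, B = 1, C = 2, D = 0)

## Assumption: "invert" means taking the reciprocal of every non-zero value,
## so that a lower eccentricity results in a higher score
indices = which(eccentricity.vec > 0)
inverted.vec = eccentricity.vec
inverted.vec[indices] = 1 / eccentricity.vec[indices]

## Zero values are not touched and therefore stay at the bottom of the ranking
print(inverted.vec)
```

Under these assumptions, a vertex with eccentricity 0 keeps the score 0 and ranks below every connected vertex. Whether that is the intended behavior is something the comment in the code should state explicitly.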
} else if (type == "network.closeness") {
closeness.centrality.vec = igraph::closeness(network)
## Construct centrality dataframe
centrality.dataframe = data.frame(author.name = names(closeness.centrality.vec),
centrality = as.vector(closeness.centrality.vec))
I am not sure about the `mode` parameter for closeness. For the degree, we use `"all"` (if I am not mistaken, please check what we really use), but for closeness, the default seems to be `"out"`. While this looks like an inconsistency in igraph that both functions have different default values, I am not sure whether there is an actual reason why closeness has `"out"` as default.
Could you please check with the igraph documentation and with small examples of directed networks whether we should use `"out"` or `"all"` here? In general, I would like to preserve consistency, but there might be reasons to deviate from consistency 😉
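A small experiment along these lines could look like the following sketch. The three-vertex chain and the variable names are made up for illustration and are not taken from the project's test networks.

```r
library(igraph)

## Hypothetical directed chain: A -> B -> C
network = igraph::graph_from_literal(A -+ B, B -+ C)

## Closeness following only outgoing edges (igraph's default for closeness)
closeness.out = igraph::closeness(network, mode = "out")

## Closeness ignoring edge direction (consistent with mode = "all" for the degree)
closeness.all = igraph::closeness(network, mode = "all")

## Degree with mode = "all", for comparison
degree.all = igraph::degree(network, mode = "all")

print(closeness.out)
print(closeness.all)
print(degree.all)
```

With `mode = "all"`, the middle vertex B is the closest to all others, which matches the intuition for an undirected view of the network. With `mode = "out"`, vertices that cannot reach any other vertex (here, C) are handled differently depending on the igraph version, so the printed values for `"out"` are worth checking against the installed version.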
get.author.class.network.betweenness = function(network, result.limit = NULL,
restrict.classification.to.authors = NULL) {
Indentation wrong. (Also applies to some of the functions below.)
@@ -15,6 +15,7 @@
 - Add commit network as a new type of network. It uses commits as vertices and connects them either via cochange or commit interactions. This includes adding new config parameters and the function `add.vertex.attribute.commit.network` for adding vertex attributes to a commit network (PR #263, ab73271781e8e9a0715f784936df4b371d64c338, ab73271781e8e9a0715f784936df4b371d64c338, cd9a930fcb54ff465c2a5a7c43cfe82ac15c134d)
 - Add `remove.duplicate.edges` function that takes a network as input and conflates identical edges (PR #268, d9a4be417b340812b744f59398ba6460ba527e1c, 0c2f47c4fea6f5f2f582c0259f8cf23af985058a, c6e90dd9cb462232563f753f414da14a24b392a3)
 - Add `cumulative` as an argument to `construct.ranges` which enables the creation of cumulative ranges from given revisions (PR #268, a135f6bb6f83ccb03ae27c735c2700fccc1ee0c8, 8ec207f1e306ef6a641fb0205a9982fa89c7e0d9)
+- Add four new metric which can be used for the classification of authors into core and peripheral: Betweenness, Closeness, Pagerank and Eccentricity (PR #276, 65d5c9cc86708777ef458b0c2e744ab4b846bdd1, b392d1a125d0f306b4bce8d95032162a328a3ce2, c5d37d40024e32ad5778fa5971a45bc08f7631e0)
metric ➡️ metrics
which ➡️ that
And there is no need to capitalize the metrics' names. But please put a comma before the final occurrence of "and".
Prerequisites
showcase.R with respect to my changes
dev
Description
Add four new metrics that can be used for the classification of authors into core and peripheral:
- Betweenness, which measures the number of shortest paths between developers that go through a given developer vertex;
- Closeness, which measures how close a developer is to all others by taking the inverse of the sum of its shortest-path distances to all other developers;
- PageRank, which is based on Google's PageRank algorithm and is closely related to eigenvector centrality;
- Eccentricity, which measures the distance to the furthest developer vertex.
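As a rough illustration of what the four metrics measure, the following sketch computes them directly with igraph on a tiny undirected network. The network, the vertex names, and the `metrics` data frame are made up for illustration; the sketch does not use the classification functions added in this PR.

```r
library(igraph)

## Hypothetical author network: A is only connected to B; B, C, and D form a triangle
network = igraph::graph_from_literal(A - B, B - C, B - D, C - D)

## Compute the four centrality metrics with plain igraph calls
metrics = data.frame(
    author.name = igraph::V(network)$name,
    betweenness = as.vector(igraph::betweenness(network)),
    closeness = as.vector(igraph::closeness(network)),
    pagerank = as.vector(igraph::page_rank(network)$vector),
    eccentricity = as.vector(igraph::eccentricity(network))
)

print(metrics)
```

In this toy network, B lies on every shortest path from A to the other vertices, so it has the highest betweenness and closeness and the lowest eccentricity. That is why eccentricity has to be inverted before it can be used for ranking core developers.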
Changelog
Added