Version v3.2 #126

Merged
merged 54 commits into from
May 17, 2018
8ebcf20
Enable contraction of imperfect ranges in the end
bockthom Apr 18, 2018
7c0183c
Update changelog
bockthom Apr 18, 2018
5f9c75a
Minor fixes from review in PR #117
bockthom Apr 19, 2018
15d2072
Adapt 'imperfect.range.ratio' in function 'construct.overlapping.ranges'
clhunsen Apr 20, 2018
33871a7
Update possible relations in showcase.R
bockthom Apr 23, 2018
5af8be4
Merge pull request #117 from bockthom/thomas-updates
clhunsen Apr 23, 2018
2f1b4d9
Add implemention of multi-edge-networks for author and artifact networks
ecklbarb Apr 17, 2018
b55d3e8
Adapt plot functions to multi-edge networks
ecklbarb Apr 17, 2018
f190ca1
Use color palette 'viridis' for plotting for better flexibility
ecklbarb Apr 17, 2018
7c628fb
Introduce the vertex attribute 'kind'
ecklbarb Apr 17, 2018
021ac8b
Change the 'simplify.network' function to handle multi-edge networks
ecklbarb Apr 18, 2018
784c417
Include 'relation' and 'kind' in the expected data of the test suite
ecklbarb Apr 18, 2018
7ad49c4
Remove vertex attribute 'id' and add vertex attribute 'artifact.type'
ecklbarb Apr 19, 2018
be6ee8c
Add tests for networks with multiple relations
ecklbarb Apr 19, 2018
2941c22
Set Copyright in test suite
ecklbarb Apr 20, 2018
cd4645f
Add description of edge and vertex attributes in 'README'
ecklbarb Apr 20, 2018
3e286ac
Bug fix: Set attribute 'artifact.type' correctly
ecklbarb Apr 23, 2018
c2e92c7
Bug fix: Plotting multi networks
ecklbarb Apr 23, 2018
e95afd0
Update changelog
ecklbarb Apr 20, 2018
ef094eb
Minor fixes from review in PR #115
ecklbarb Apr 26, 2018
d791df8
Change plot function: edge width depends on edge weight
ecklbarb Apr 26, 2018
86f39c5
Remove unneeded browser() statements
clhunsen Apr 26, 2018
84516fc
Streamline and improve vertex and edge attributes
clhunsen Apr 26, 2018
4dead55
Adjust resolution of vertex attribute 'kind'
clhunsen Apr 26, 2018
a9605c6
Rename artifact-vertex kind for mail relation to 'MailThread'
clhunsen Apr 26, 2018
78c7c87
Rename artifact type for issue relation to 'IssueComment'
clhunsen Apr 26, 2018
26f0c59
Fix plot legends
clhunsen Apr 26, 2018
0817dd8
Add additional parameters to network simplification functions
bockthom Apr 27, 2018
74bd090
Rename artifact type for issue relation to 'IssueEvent'
ecklbarb Apr 27, 2018
8833ae8
Minor fixes from review in PR #115
ecklbarb Apr 30, 2018
9e68293
Add comment
ecklbarb Apr 30, 2018
b5e03ac
Simplify resolution of vertex kind for artifact networks
clhunsen Apr 30, 2018
bc3c8c1
Adjust return value of 'construct.edge.list.from.key.value.list'
clhunsen Apr 30, 2018
2241817
Merge pull request #115 from ecklbarb/barbara-updates
bockthom Apr 30, 2018
992ddf8
Adapt pasta read method to also read lines without mapping
Apr 18, 2018
8139f34
Add method to classify developers with hierarchy
Apr 18, 2018
c0277c3
Fix bug that eigenvector centrality does not consider directedness
Apr 18, 2018
70d9b8b
Add pasta data to commits if configured
Apr 18, 2018
1400cc2
Fix typo in revision.set.id
Apr 18, 2018
ebf4958
Adapt pasta reading test
Apr 18, 2018
ab00c96
Verify the given ProjectConf object in the setter
Apr 26, 2018
00df306
Add equals method to the ProjectData class
Apr 26, 2018
13a5212
Add test for the equals method of ProjectData
Apr 26, 2018
df5ed1b
Add 'viridis' package also to install.R
bockthom May 4, 2018
9a5a2c1
Fix some minor documentation and coding style issues
May 7, 2018
6de0830
Add minor code improvements
May 9, 2018
08f716f
Add minor improvements to code and documentation
May 11, 2018
bfdabdd
Remove the data timestamps from the data sources for the equals method
May 11, 2018
5f2f925
Add changes of PR #124 to the Changelog file
May 11, 2018
7e5e6a5
Add equals method to the RangeData class
May 17, 2018
f8b5985
Restructure equals method in RangeData
May 17, 2018
80a83ad
Fix minor coding style and documentation issues
May 17, 2018
d56d5ae
Merge pull request #124 from hechtlC/master
clhunsen May 17, 2018
510863e
Version v3.2
clhunsen May 17, 2018
37 changes: 37 additions & 0 deletions NEWS.md
@@ -1,6 +1,43 @@
# codeface-extraction-r – Changelog


## 3.2

### Added
- Handling of multiple relations for all types of networks (#98, #15, #11, PR #115)
- Allow several values for the entries `author.relation` and `artifact.relation` in the `NetworkConf` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Add the mandatory edge attribute `relation` representing `author.relation` or `artifact.relation`, respectively, for all types of networks (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Return data for several relations in `get.bipartite.relation` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Retain one edge for each available value of edge attribute `relation` during network simplification (021ac8b88e9a181364a51e89807df55cb741ed44)
- Add new tests and adapt existing ones (784c417c50eb1de5d0143908a390ead6ba22dbbf, 7ad49c4ad937c9a6c7398a45179e25d5d5c03faa, be6ee8cd48dc7692e02b7f1c512870591300fa8a)
- Add the mandatory vertex attribute `kind` describing the actual vertex kind (7c628fb93eb21f280c7d9da66680f817e107fa24, 784c417c50eb1de5d0143908a390ead6ba22dbbf)
- Respect new vertex and edge attributes in plot functions (b55d3e84a5f9b122dacd0ee52784d930f22d1f4b)
- Possibility to merge networks with function `merge.networks` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Possibility to merge edge and vertex lists with function `merge.network.data` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Add function `create.empty.edge.list` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Possibility to contract imperfect ranges in the end (#104, 8ebcf20b0aba0cb82dcd7e1d1b95e261a866d04e)
- Add method `ProjectData$equals` (#116, 00df306a3e6dbdeb81ddc116e88a4854b07afe72)
- Add author classification by hierarchy to the core-peripheral module (8139f34fd809d6750064514a549024df4cbf5863)
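
The entry on network simplification above says that one edge is retained for each available value of the edge attribute `relation`. A minimal sketch of that idea in base R follows. It is not the code of `simplify.network`: the edge list, the `weight` column, and summing as the way to combine parallel edges are all assumptions made for illustration.

```r
## Hypothetical edge list carrying the mandatory edge attribute 'relation'.
## The 'weight' column and the vertex names are assumptions for this sketch.
edges = data.frame(
    from = c("A", "A", "A", "B"),
    to = c("B", "B", "B", "C"),
    relation = c("mail", "mail", "cochange", "mail"),
    weight = c(1, 1, 1, 1),
    stringsAsFactors = FALSE
)

## Collapse parallel edges only within the same relation, so that the
## 'mail' and 'cochange' edges between A and B stay separate.
simplified = aggregate(weight ~ from + to + relation, data = edges, FUN = sum)

print(simplified)
```

Between `A` and `B`, the two `mail` edges are merged into one edge while the `cochange` edge is kept as a second edge, so the pair ends up with one edge per relation.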

### Changed/Improved
- Remove the mandatory vertex attribute `artifact.type` due to inconsistent use ()
- Remove the mandatory vertex attribute `id` from artifact vertices due to inconsistent use (7ad49c4ad937c9a6c7398a45179e25d5d5c03faa)
- Streamline edge attribute `artifact.type` for uniformity ()
- Use color palette 'viridis' for plotting for better flexibility (f190ca130a15a82e5eed836e9ffc53b8a34aac20)
- Edge width in network plots now depends on edge weight, i.e., `width = 0.3 + 0.5 * log(weight)` (d791df8e2c41314f86c36b3af566141e7713f46c)
- Split function `construct.network.from.list` into the two functions `construct.edge.list.from.key.value.list` and `construct.network.from.edge.list` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Handle data for more than one relation in function `add.edges.for.bipartite.relation` (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- Retain one edge for each available value of edge attribute `relation` during network simplification (021ac8b88e9a181364a51e89807df55cb741ed44)
- Read also lines from the PaStA data without the `message.id` being mapped to a `commit.hash` (992ddf8d582a7a023f000b4fc57f9ff85a7f38f6)
- Add column `revision.set.id` to PaStA data to indicate which e-mails are concerned with the same patch (992ddf8d582a7a023f000b4fc57f9ff85a7f38f6)
- Add PaStA data to the unfiltered commit data if configured (70d9b8bd4cb16636086ca7ab90e817b89844f172)
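
The edge-width rule stated above, `width = 0.3 + 0.5 * log(weight)`, can be tried out in isolation. The following sketch only evaluates the formula from the changelog entry; the helper name `edge.width` and the sample weights are hypothetical and not part of the library.

```r
## Edge width as given in the changelog entry: 0.3 + 0.5 * log(weight).
## 'edge.width' is a hypothetical helper name; 'log' is the natural logarithm in R.
edge.width = function(weight) {
    return(0.3 + 0.5 * log(weight))
}

## Sample weights chosen for illustration only
weights = c(1, 2, 10)
widths = edge.width(weights)

print(data.frame(weight = weights, width = widths))
```

An edge of weight 1 gets the base width 0.3, and the logarithm keeps heavily weighted edges from dominating the plot.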

### Fixed
- Check whether the object passed to the `ProjectConf` setter in the `ProjectData` class really is an object of type `ProjectConf` (ab00c962e164428df2d59de7292eed3c3b1352aa)
- The method for eigenvector centrality now properly considers whether the network is directed or not (c0277c36e4ff45cfbb421317a42b6ea8736afe53)
- Fix a bug that caused errors when the core classification within a core-periphery classification is empty (c0277c36e4ff45cfbb421317a42b6ea8736afe53)


## 3.1.2

### Changed/Improved
26 changes: 22 additions & 4 deletions README.md
@@ -43,6 +43,7 @@ When selecting a version to work with, you should consider the following points:
- `ggraph`: For plotting of networks (needs `udunits2` system library, e.g., `libudunits2-dev` on Ubuntu!)
- `markovchain`: For core/peripheral transition probabilities
- `lubridate`: For convenient date conversion and parsing
- `viridis`: For plotting of networks with nice colours


## How-To
@@ -214,7 +215,7 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
**Note**: Default values are shown in *italics*.

- `author.relation`
* The relation among authors, encoded as edges in an author network
* The relation(s) among authors, encoded as edges in an author network
* **Note**: The author--artifact relation in bipartite and multi networks is configured by `artifact.relation`!
* possible values: [*`"mail"`*, `"cochange"`, `"issue"`]
- `author.directed`
@@ -228,7 +229,7 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
* Remove all authors from an author network (including bipartite and multi networks) who are not present in an author network constructed with `artifact.relation` as relation, i.e., all authors that have no bipartite relations in a bipartite/multi network are removed.
* [`TRUE`, *`FALSE`*]
- `artifact.relation`
* The relation among artifacts, encoded as edges in an artifact network
* The relation(s) among artifacts, encoded as edges in an artifact network
* **Note**: This relation configures also the author--artifact relation in bipartite and multi networks!
* possible values: [*`"cochange"`*, `"callgraph"`, `"mail"`, `"issue"`]
- `artifact.directed`
@@ -239,13 +240,14 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
* The list of edge-attribute names and information
* a subset of the following as a single vector:
- timestamp information: *`"date"`*, `"date.offset"`
- general information: *`"artifact.type"`*
- author information: `"author.name"`, `"author.email"`
- committer information: `"committer.date"`, `"committer.name"`, `"committer.email"`
- e-mail information: *`"message.id"`*, *`"thread"`*, `"subject"`
- commit information: *`"hash"`*, *`"file"`*, *`"artifact.type"`*, *`"artifact"`*, `"changed.files"`, `"added.lines"`, `"deleted.lines"`, `"diff.size"`, `"artifact.diff.size"`, `"synchronicity"`
- commit information: *`"hash"`*, *`"file"`*, *`"artifact"`*, `"changed.files"`, `"added.lines"`, `"deleted.lines"`, `"diff.size"`, `"artifact.diff.size"`, `"synchronicity"`
- PaStA information: `"pasta"`,
- issue information: *`"issue.id"`*, *`"event.name"`*, `"issue.state"`, `"creation.date"`, `"closing.date"`, `"is.pull.request"`
* **Note**: `"date"` is always included as this information is needed for several parts of the library, e.g., time-based splitting.
* **Note**: `"date"` and `"artifact.type"` are always included as this information is needed for several parts of the library, e.g., time-based splitting.
* **Note**: For each type of network that can be built, only the applicable part of the given vector of names is respected.
* **Note**: For the edge attributes `"pasta"` and `"synchronicity"`, the project configuration's parameters `pasta` and `synchronicity` need to be set to `TRUE`, respectively (see below).
- `simplify`
@@ -267,6 +269,22 @@ You can also update the `NetworkConf` object at any time by calling `NetworkBuil
For more examples, please look in the file `showcase.R`.


### Network properties

- Mandatory vertex attributes
* *`"type"`*: [`"Author"`, `"Artifact"`]
* *`"kind"`*: [`"Author"`, `"File"`, `"Feature"`, `"Function"`, `"MailThread"`,
`"Issue"`, `"FeatureExpression"`]
* *`"name"`*

- Mandatory edge attributes
* *`"type"`*: [`Unipartite`, `Bipartite`]
* *`"artifact.type"`*: [`"File"`, `"Feature"`, `"Function"`, `"Mail"`,
`"IssueEvent"`, `"FeatureExpression"`]
* *`"relation"`*: [`mail`, `cochange`, `issue`, `callgraph`] (from `artifact.relation` and `author.relation` attributes in the `NetworkConf` class)
* *`"date"`*


## File/Module overview

- `util-init.R`
7 changes: 4 additions & 3 deletions install.R
@@ -22,7 +22,7 @@
## to our needs.


pacakges = c(
packages = c(
"yaml",
"R6",
"igraph",
@@ -34,7 +34,8 @@ pacakges = c(
"ggplot2",
"ggraph",
"markovchain",
"lubridate"
"lubridate",
"viridis"
)


@@ -47,7 +48,7 @@ filter.installed.packages = function(packageList) {
}


p = filter.installed.packages(pacakges)
p = filter.installed.packages(packages)
if (length(p) > 0) {
print(sprintf("Installing package '%s'.", p))
install.packages(p, dependencies = TRUE, verbose = FALSE, quiet = FALSE)
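
The body of `filter.installed.packages` is not part of this diff. Judging from how its result `p` is passed to `install.packages`, it presumably returns the packages that are not installed yet. A minimal sketch of such a filter, under that assumption and with the hypothetical name `filter.missing.packages`, might look like this:

```r
## Hypothetical re-implementation for illustration; the actual body of
## 'filter.installed.packages' is not shown in the diff above.
filter.missing.packages = function(package.list) {
    ## row names of the matrix returned by installed.packages() are the package names
    installed = rownames(utils::installed.packages())
    return(package.list[!(package.list %in% installed)])
}

## 'stats' ships with R; the second name is a made-up package that should not exist
to.install = filter.missing.packages(c("stats", "surely.not.a.real.package"))
print(to.install)
```

Only the names that are not installed remain in the result, so a subsequent `install.packages` call touches nothing that is already available.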
6 changes: 3 additions & 3 deletions showcase.R
@@ -49,10 +49,10 @@ CF.DATA = "/path/to/codeface-data" # path to codeface data
CF.SELECTION.PROCESS = "threemonth" # releases, threemonth(, testing)

CASESTUDY = "busybox"
ARTIFACT = "feature" # function, feature, file, featureexpression
ARTIFACT = "feature" # function, feature, file, featureexpression (only relevant for cochange)

AUTHOR.RELATION = "mail" # mail, cochange
ARTIFACT.RELATION = "cochange" # cochange, callgraph
AUTHOR.RELATION = "mail" # mail, cochange, issue
ARTIFACT.RELATION = "cochange" # cochange, callgraph, mail, issue


## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
@@ -1,5 +1,5 @@
<[email protected]> => 72c8dd25d3dd6d18f46e2b26a5f5b1e2e8dc28d0
<[email protected]> => 5a5ec9675e98187e1e92561e1888aa6f04faa338
<[email protected]> => 3a0ed78458b3976243db6829f63eba3eead26774
<[email protected]>
<[email protected]> <[email protected]> <[email protected]> => 1143db502761379c2bfcecc2007fc34282e7ee61
<[email protected]> => 0a1a5c523d835459c42f33e863623138555e2526 72c8dd25d3dd6d18f46e2b26a5f5b1e2e8dc28d0
4 changes: 3 additions & 1 deletion tests/test-data-cut.R
@@ -14,6 +14,7 @@
## Copyright 2017 by Christian Hechtl <[email protected]>
## Copyright 2017 by Felix Prasse <[email protected]>
## Copyright 2018 by Claus Hunsen <[email protected]>
## Copyright 2018 by Barbara Eckl <[email protected]>
## All Rights Reserved.


@@ -68,7 +69,8 @@ test_that("Cut commit and mail data to same date range.", {
date = get.date.from.string("2016-07-12 16:04:40"),
date.offset = as.integer(c(100)),
subject = c("Re: Fw: busybox 2 tab"),
thread = sprintf("<thread-%s>", c(9)))
thread = sprintf("<thread-%s>", c(9)),
artifact.type = "Mail")

commit.data = x.data$get.data.cut.to.same.date(data.sources = data.sources)$get.commits()
rownames(commit.data) = 1:nrow(commit.data)
125 changes: 125 additions & 0 deletions tests/test-data.R
@@ -0,0 +1,125 @@
## This file is part of codeface-extraction-r, which is free software: you
## can redistribute it and/or modify it under the terms of the GNU General
## Public License as published by the Free Software Foundation, version 2.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License along
## with this program; if not, write to the Free Software Foundation, Inc.,
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2018 by Christian Hechtl <[email protected]>
## Copyright 2018 by Claus Hunsen <[email protected]>
## All Rights Reserved.


context("Tests for ProjectData functionalities.")

##
## Context
##

CF.DATA = file.path(".", "codeface-data")
CF.SELECTION.PROCESS = "testing"
CASESTUDY = "test"
ARTIFACT = "feature"

## use only when debugging this file independently
if (!dir.exists(CF.DATA)) CF.DATA = file.path(".", "tests", "codeface-data")

test_that("Compare two ProjectData objects", {

## initialize a ProjectData object with the ProjectConf and clone it into another one
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
proj.data.one = ProjectData$new(project.conf = proj.conf)
proj.data.two = proj.data.one$clone()

expect_true(proj.data.one$equals(proj.data.two), info = "Two identical ProjectData objects.")

## Always change one data source in the one object, test for inequality, change it in the
## second object, as well, and test for equality.

## change the first data object
proj.data.one$get.commits()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.two$get.commits()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

proj.data.two$get.pasta()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.one$get.pasta()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

proj.data.two$get.mails()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.one$get.mails()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

proj.data.one$get.issues()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.two$get.issues()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

proj.data.two$get.authors()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.one$get.authors()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

proj.data.one$get.synchronicity()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.two$get.synchronicity()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")
})

test_that("Compare two RangeData objects", {

## initialize a ProjectData object with the ProjectConf
## cut it on the base of commits and clone the resulting RangeData object
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
proj.data.base = ProjectData$new(project.conf = proj.conf)
range.data.one = proj.data.base$get.data.cut.to.same.date("commits")
range.data.two = range.data.one$clone()

## compare the two equal RangeData objects
expect_true(range.data.one$equals(range.data.two))

## cut the ProjectData object on base of issues in order to get another
## RangeData object to check for inequality
range.data.three = proj.data.base$get.data.cut.to.same.date("issues")

expect_false(range.data.one$equals(range.data.three))

## check whether a ProjectData object can be compared to a RangeData object
expect_false(range.data.one$equals(proj.data.base))
expect_false(proj.data.base$equals(range.data.one))

## create a RangeData object with the same data sources as proj.data.base
## and check for inequality
timestamps = proj.data.base$get.data.timestamps(outermost = TRUE)
range.data.four = split.data.time.based(proj.data.base, bins =
c(timestamps[["start"]][[1]], timestamps[["end"]][[1]]))[[1]]

expect_false(proj.data.base$equals(range.data.four))

})