[#944] Implement authorship analysis #2140

Merged on Apr 28, 2024 (161 commits)
Changes from 109 commits

Commits
b4fc81b
add isAuthorshipAnalyzed flag
SkyBlaise99 Jul 26, 2023
1ea3210
add isAuthorshipAnalyzed to cli arguments
SkyBlaise99 Jul 26, 2023
5e2cdea
reformat cli args
SkyBlaise99 Jul 26, 2023
df5e498
added test cases for args parser
SkyBlaise99 Jul 27, 2023
9b77bda
reformat ArgsParserTest and InputBuilder
SkyBlaise99 Jul 27, 2023
689cdb3
update equals method in cli args
SkyBlaise99 Jul 27, 2023
cb0cf90
pass shouldAnalyzeAuthorship flag from reposense to report generator …
SkyBlaise99 Jul 30, 2023
81b7621
update javadocs for report generator and authorship reporter
SkyBlaise99 Jul 30, 2023
bce2fb2
reformat report generator and authorship reporter
SkyBlaise99 Jul 30, 2023
eb488bc
pass shouldAnalyzeAuthorship flag from authorship reporter to file in…
SkyBlaise99 Aug 1, 2023
e724d47
update javadocs for file info analyzer
SkyBlaise99 Aug 1, 2023
33e393d
add overloading method to fix failing testcases
SkyBlaise99 Aug 1, 2023
72921d7
implement authorship analyzer
SkyBlaise99 Aug 6, 2023
f3fbc33
reformat files
SkyBlaise99 Aug 6, 2023
f7cbe5b
set to default full credit
SkyBlaise99 Aug 7, 2023
03c1a2a
update expected outputs for local repo system tests
SkyBlaise99 Aug 7, 2023
acc6665
update expected outputs for config system tests
SkyBlaise99 Aug 7, 2023
71839dd
update comment
SkyBlaise99 Aug 7, 2023
c8442b8
add test cases for new git methods
SkyBlaise99 Aug 13, 2023
7b69c31
add AuthorshipAnalyzer test cases
SkyBlaise99 Aug 13, 2023
983c784
convert since date to millisec using config's zone id
SkyBlaise99 Aug 13, 2023
dc57517
fix error in obtaining commit time
SkyBlaise99 Aug 13, 2023
cadbeb2
shift getLevenshteinDistance to StringsUtil
SkyBlaise99 Aug 16, 2023
45539c1
fix warnings
SkyBlaise99 Aug 16, 2023
b2d6a08
Merge branch 'master' into 944-analyze-authorship
SkyBlaise99 Aug 17, 2023
397ca6f
store isFullCredit info into segments
SkyBlaise99 Aug 18, 2023
40a7598
store as a single value when the whole segment are all full credit,
SkyBlaise99 Aug 18, 2023
9272869
rename variable
SkyBlaise99 Aug 18, 2023
00c4036
update background for is not full credit
SkyBlaise99 Aug 18, 2023
97f1966
Merge branch 'master' into 944-analyze-authorship
SkyBlaise99 Aug 24, 2023
971cc80
switch to jdk 8 methods
SkyBlaise99 Aug 24, 2023
8293e0f
remove unused imports
SkyBlaise99 Aug 24, 2023
f8a8f30
Merge branch 'master' into 944-analyze-authorship
SkyBlaise99 Sep 1, 2023
489cf6d
fix null pointer exception caused by
SkyBlaise99 Sep 5, 2023
93c0f75
use AY2223S2 tp repo
SkyBlaise99 Sep 12, 2023
613faeb
override equals for candidate line
SkyBlaise99 Sep 12, 2023
f45f81f
prevent frontend from installing everytime
SkyBlaise99 Sep 12, 2023
7a04297
Merge branch '944-test-base' into 944-test-cache
SkyBlaise99 Sep 12, 2023
57018b2
prevent build frontend instead
SkyBlaise99 Sep 12, 2023
48d2c8f
Merge branch '944-test-base' into 944-test-cache
SkyBlaise99 Sep 12, 2023
595b6fd
cache git diff results
SkyBlaise99 Oct 3, 2023
65acc53
compute more then cache
SkyBlaise99 Oct 3, 2023
7736d6f
fix bug in analysis
SkyBlaise99 Oct 8, 2023
6767123
rename key
SkyBlaise99 Oct 11, 2023
857201f
add cache for git log
SkyBlaise99 Oct 11, 2023
174ecc5
Merge branch 'master' into 944-analyze-authorship
SkyBlaise99 Oct 20, 2023
e200e5e
Give partial credit if annotated author is not the same as the blame
SkyBlaise99 Oct 20, 2023
65d8e99
Merge branch '944-analyze-authorship' into 944-test-cache-git-diff-v2…
SkyBlaise99 Oct 23, 2023
bdeb15a
use full and partial credit color
SkyBlaise99 Oct 29, 2023
1b51351
add SimilarityThresholdArgumentType
SkyBlaise99 Oct 29, 2023
2dc14b5
add SIMILARITY_THRESHOLD_FLAGS to ArgsParser
SkyBlaise99 Oct 29, 2023
396df49
pass similarity score down the chain
SkyBlaise99 Oct 29, 2023
e3ee0ed
fix test case
SkyBlaise99 Oct 29, 2023
c079709
Merge branch '944-similarity-threshold-flag' into 944-test-cache-git-…
SkyBlaise99 Oct 29, 2023
a187d9c
Add test cases for annotated author overriding last author's credit
SkyBlaise99 Nov 7, 2023
58b7002
Merge branch 'master' into 944-analyze-authorship
SkyBlaise99 Nov 7, 2023
b296b83
revert merge from master
SkyBlaise99 Nov 7, 2023
4ce6545
revert merge from master 58b70025
SkyBlaise99 Nov 7, 2023
4bd05a7
Trigger workflow
SkyBlaise99 Nov 8, 2023
950c912
Merge branch 'master' into 944-analyze-authorship
SkyBlaise99 Nov 8, 2023
a46d423
Revert "Merge branch 'master' into 944-analyze-authorship"
SkyBlaise99 Nov 8, 2023
bba556d
fix frontend test failing
SkyBlaise99 Nov 8, 2023
a8b0b19
[#944] Fix failing frontend tests (#2068)
SkyBlaise99 Nov 11, 2023
4d7d3aa
Merge branch '944-analyze-authorship' into 944-analyze-authorship
SkyBlaise99 Nov 12, 2023
b827cc8
[#944] Fix wrong credit information inherited by annotated author (#2…
SkyBlaise99 Nov 12, 2023
1b25572
Merge branch '944-swap-color' into 944-analyze-authorship
SkyBlaise99 Nov 12, 2023
9e93961
Merge branch 'reposense:944-analyze-authorship' into 944-analyze-auth…
SkyBlaise99 Nov 12, 2023
086a64b
[#944] Improve visualization for full and partial credit (#2070)
SkyBlaise99 Jan 8, 2024
c85423e
[#944] Change to originality score and new threshold value (#2072)
SkyBlaise99 Jan 8, 2024
896c55a
Merge branch 'reposense:944-analyze-authorship' into 944-analyze-auth…
SkyBlaise99 Jan 8, 2024
e8cb72a
[#944] Differentiate full and partial credit when group is merged (#2…
SkyBlaise99 Jan 27, 2024
39c8058
Merge branch 'reposense:944-analyze-authorship' into 944-analyze-auth…
SkyBlaise99 Jan 31, 2024
d217cbd
Add cache for git log and git diff
SkyBlaise99 Jan 8, 2024
3f933e4
reduce space complexity down to O(min(s, t))
SkyBlaise99 Feb 4, 2024
bbb2dc9
reduce time complexity
SkyBlaise99 Feb 5, 2024
ef2d67d
add early termination
SkyBlaise99 Feb 5, 2024
fb95942
early termination if limit is reached
SkyBlaise99 Feb 5, 2024
f1fb667
early termination
SkyBlaise99 Feb 5, 2024
8f682a8
add cache for git log and git diff
SkyBlaise99 Feb 6, 2024
5e7e1e6
reduce space complexity to O(min(s, t))
SkyBlaise99 Feb 6, 2024
11f16e2
add several early termination
SkyBlaise99 Feb 6, 2024
ba457b3
Merge branch '944-cache' into 944-improve-performance
SkyBlaise99 Feb 6, 2024
551f5d5
Merge branch '944-lev-dist' into 944-improve-performance
SkyBlaise99 Feb 6, 2024
9e9f961
fix checkstyle and update comments
SkyBlaise99 Feb 6, 2024
a621357
add originality threshold flag
SkyBlaise99 Feb 18, 2024
25ef8ca
pass originality threshold param down
SkyBlaise99 Feb 18, 2024
6e33b3d
update analyzeAuthorship to use input originalityThreshold
SkyBlaise99 Feb 18, 2024
c2639d4
Merge branch '944-improve-performance' into 944-merge-conflict
SkyBlaise99 Feb 20, 2024
5a3af8e
Merge branch '944-originality-threshold-flag' into 944-merge-conflict
SkyBlaise99 Feb 20, 2024
1d1a76d
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Feb 20, 2024
83510f1
remove depreciated file segment.ts
SkyBlaise99 Feb 20, 2024
79209a3
[#944] Add originality threshold flag (#2122)
SkyBlaise99 Mar 3, 2024
fdd2dae
[#944] Improve performance (#2108)
SkyBlaise99 Mar 3, 2024
34b3673
Merge branch 'reposense:944-analyze-authorship' into 944-analyze-auth…
SkyBlaise99 Mar 3, 2024
7290b07
Merge branch '944-analyze-authorship' into 944-merge-conflict
SkyBlaise99 Mar 3, 2024
ef320c0
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Mar 3, 2024
41900e9
revert some format changes
SkyBlaise99 Mar 3, 2024
e57b81d
cleanup
SkyBlaise99 Mar 3, 2024
37391da
remove argument type test and add test cases into arg parser test
SkyBlaise99 Mar 9, 2024
cbe09ec
Merge branch 'master' into 944-merge-conflict
ckcherry23 Mar 15, 2024
13f0992
use optional as return value
SkyBlaise99 Mar 16, 2024
13c11d4
added FileDiffInfo and extract out getFileDiffInfoList()
SkyBlaise99 Mar 16, 2024
7b0b61e
fix method missing due to low language lvl
SkyBlaise99 Mar 16, 2024
ad53204
fix checkstyle
SkyBlaise99 Mar 16, 2024
e579818
add frontend tests for background colour based on credit
SkyBlaise99 Mar 16, 2024
d07286f
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Mar 16, 2024
177e408
fix checkstyle
SkyBlaise99 Mar 16, 2024
3a903f2
Restart checks
SkyBlaise99 Mar 17, 2024
d1f3682
set font color to a darker green for better visibility
SkyBlaise99 Mar 17, 2024
ad1e326
update cli in ug
SkyBlaise99 Mar 19, 2024
6149ddc
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Mar 21, 2024
e2f5219
update docs
SkyBlaise99 Mar 21, 2024
67515e6
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Mar 27, 2024
3ae5a81
switch to pattern.split
SkyBlaise99 Mar 27, 2024
75bbbf3
Merge branch 'master' into 944-merge-conflict
ckcherry23 Mar 27, 2024
cfa8406
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Mar 28, 2024
37bc5fa
update background color for full credit author
SkyBlaise99 Mar 28, 2024
d1e39d1
comment to line 22
SkyBlaise99 Mar 28, 2024
3565d85
added type State
SkyBlaise99 Mar 28, 2024
bc475a7
display 'author: username' if full credit, else 'co-author: username'
SkyBlaise99 Mar 28, 2024
5ef3508
fix failing test case
SkyBlaise99 Mar 28, 2024
4fd3d55
update type name
SkyBlaise99 Mar 28, 2024
9151382
added legend
SkyBlaise99 Mar 28, 2024
d3d174c
added grey for merged repos
SkyBlaise99 Mar 28, 2024
3f31547
fix failing test case
SkyBlaise99 Mar 28, 2024
a2d986e
simplify codes
SkyBlaise99 Mar 28, 2024
82aa45b
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Mar 28, 2024
84bd488
update display
SkyBlaise99 Mar 28, 2024
436c6c9
add separator and remove abc
SkyBlaise99 Mar 29, 2024
8e615f7
move things to GitBlame.java
SkyBlaise99 Mar 31, 2024
cf6aff2
simplify to Objects.equals
SkyBlaise99 Mar 31, 2024
8361c05
add new test cases
SkyBlaise99 Mar 31, 2024
67c4b7e
store raw author info and remove bifunction
SkyBlaise99 Mar 31, 2024
4e647db
Merge branch '944-move-git-blame-line-results' into 944-merge-conflict
SkyBlaise99 Mar 31, 2024
044eb73
remove legend
SkyBlaise99 Mar 31, 2024
d4ab2ec
Merge branch '944-code-cov' into 944-merge-conflict
SkyBlaise99 Apr 1, 2024
92b74fc
update until date
SkyBlaise99 Apr 1, 2024
cd1626f
update date
SkyBlaise99 Apr 1, 2024
fcc499d
Revert "update until date"
SkyBlaise99 Apr 1, 2024
c15de7c
Revert "Merge branch '944-code-cov' into 944-merge-conflict"
SkyBlaise99 Apr 1, 2024
57d650f
add another test
SkyBlaise99 Apr 1, 2024
28db194
Merge branch '944-code-cov' into 944-merge-conflict
SkyBlaise99 Apr 2, 2024
29b8991
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Apr 2, 2024
91ac2c7
default to partial credit if feature is disabled
SkyBlaise99 Apr 2, 2024
c5908a2
pass isAuthorshipAnalyzed flag to summary json
SkyBlaise99 Apr 2, 2024
cc59ed9
extract isAuthorshipAnalyzed from summary json
SkyBlaise99 Apr 2, 2024
2d32eee
display legend only when isAuthorshipAnalyzed is true
SkyBlaise99 Apr 2, 2024
b3efbfb
display legend contents
SkyBlaise99 Apr 2, 2024
36218b4
fix system tests
SkyBlaise99 Apr 2, 2024
21d16a6
reorder according to alphabetical order
SkyBlaise99 Apr 2, 2024
c9ece51
update default behavior and add explanation
SkyBlaise99 Apr 2, 2024
5074f3e
add brief explanation for contribution
SkyBlaise99 Apr 2, 2024
0e98707
remove text from cli
SkyBlaise99 Apr 2, 2024
c3eea21
remove divider
SkyBlaise99 Apr 2, 2024
e0e5252
Merge branch 'master' into 944-merge-conflict
SkyBlaise99 Apr 17, 2024
eb88218
add default false for isAuthorshipAnalyzed for backwards compatability
SkyBlaise99 Apr 17, 2024
9e90b68
add default false for isFullCredit for backwards compatability with
SkyBlaise99 Apr 17, 2024
d5b187f
Merge branch 'master' into 944-merge-conflict
ckcherry23 Apr 27, 2024
7fa0f99
add empty lines to pass linter check
SkyBlaise99 Apr 28, 2024
d085fdb
fix failing tcs
SkyBlaise99 Apr 28, 2024
acd50b7
fix failing tc
SkyBlaise99 Apr 28, 2024
2 changes: 1 addition & 1 deletion build.gradle
@@ -196,7 +196,7 @@ def serveTestReportInBackground = tasks.register('serveTestReportInBackground',
     workingDir = 'build/serveTestReport'
     main = mainClassName
     classpath = sourceSets.main.runtimeClasspath
-    args = ['--config', './exampleconfig', '--since', 'd1', '--view']
+    args = ['--config', './exampleconfig', '--since', 'd1', '--view', '-A']
     String versionJvmArgs = '-Dversion=' + getRepoSenseVersion()
     jvmArgs = [ versionJvmArgs ]
     killDescendants = false // Kills descendants of started process using methods only found in Java 9 and beyond.
@@ -0,0 +1,42 @@
+describe('credit background colour', () => {
+  it('check if background colour match the credit information for Eugene', () => {
+    // open the code panel
+    cy.get('.icon-button.fa-code')
+      .should('exist')
+      .first()
+      .click();
+
+    // full credit - #C8E6C9
+    cy.get(':nth-child(1) > .file-content > .segment-collection > :nth-child(2) > .java > .code')
+      .should('have.css', 'background-color')
+      .and('eq', 'rgb(200, 230, 201)');
+
+    // partial credit - #E6FFED
+    cy.get(':nth-child(1) > .file-content > .segment-collection > :nth-child(4) > .java > .code')
+      .should('have.css', 'background-color')
+      .and('eq', 'rgb(230, 255, 237)');
+  });
+
+  it('check if background colour match the credit information when group is merged', () => {
+    // check merge group checkbox
+    cy.get('#summary label.merge-group > input')
+      .should('be.visible')
+      .check()
+      .should('be.checked');
+
+    // open the code panel
+    cy.get('.icon-button.fa-code')
+      .should('exist')
+      .click();
+
+    // full credit - #F0808050
+    cy.get(':nth-child(1) > .file-content > .segment-collection > :nth-child(7) > .scss > .code')
+      .should('have.css', 'background-color')
+      .and('eq', 'rgba(240, 128, 128, 0.314)');
+
+    // partial credit - #1E90FF20
+    cy.get(':nth-child(3) > .file-content > .segment-collection > :nth-child(10) > .java > .code')
+      .should('have.css', 'background-color')
+      .and('eq', 'rgba(30, 144, 255, 0.125)');
+  });
+});
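The comments in this test give colours as hex codes, while the assertions compare against the `rgb()`/`rgba()` strings reported for `background-color`. A minimal Java sketch of that conversion may help when adding further cases. `HexColourSketch` and `toCssString` are hypothetical names that are not part of this PR, and rounding alpha to three decimals is an assumption chosen because it reproduces the values asserted above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

class HexColourSketch {
    /** Converts "#RRGGBB" or "#RRGGBBAA" into the rgb()/rgba() form used in the assertions. */
    static String toCssString(String hex) {
        String digits = hex.startsWith("#") ? hex.substring(1) : hex;
        if (digits.length() != 6 && digits.length() != 8) {
            throw new IllegalArgumentException("Expected 6 or 8 hex digits: " + hex);
        }

        int red = Integer.parseInt(digits.substring(0, 2), 16);
        int green = Integer.parseInt(digits.substring(2, 4), 16);
        int blue = Integer.parseInt(digits.substring(4, 6), 16);
        if (digits.length() == 6) {
            return String.format(Locale.ROOT, "rgb(%d, %d, %d)", red, green, blue);
        }

        // Assumption: the alpha byte is scaled to [0, 1] and rounded to three decimals.
        int alphaByte = Integer.parseInt(digits.substring(6, 8), 16);
        BigDecimal alpha = new BigDecimal(alphaByte)
                .divide(new BigDecimal(255), 3, RoundingMode.HALF_UP)
                .stripTrailingZeros();
        return String.format(Locale.ROOT, "rgba(%d, %d, %d, %s)", red, green, blue, alpha.toPlainString());
    }
}
```

For example, `HexColourSketch.toCssString("#F0808050")` yields `rgba(240, 128, 128, 0.314)`, the value asserted for the merged-group full-credit segment.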
12 changes: 9 additions & 3 deletions frontend/src/components/c-segment.vue
@@ -1,6 +1,6 @@
 <template lang="pug">
 .segment(
-  v-bind:class="{ untouched: !segment.knownAuthor, active: isOpen }",
+  v-bind:class="{ untouched: !segment.knownAuthor, active: isOpen, isNotFullCredit: !segment.isFullCredit }",
   v-bind:style="{ 'border-left': `0.25rem solid ${authorColors[segment.knownAuthor]}` }",
   v-bind:title="`Author: ${segment.knownAuthor || \"Unknown\"}`"
 )
@@ -53,7 +53,7 @@ export default defineComponent({
     return {
       isOpen: (this.segment.knownAuthor !== null) || this.segment.lines.length < 5 as boolean,
       canOpen: (this.segment.knownAuthor === null) && this.segment.lines.length > 4 as boolean,
-      transparencyValue: '30' as string,
+      transparencyValue: (this.segment.isFullCredit ? '50' : '20') as string,
     };
   },
   computed: {
@@ -77,7 +77,7 @@ export default defineComponent({
   border-left: .25rem solid mui-color('green');

   .code {
-    background-color: mui-color('github', 'authored-code-background');
+    background-color: mui-color('github', 'full-authored-code-background');
     padding-left: 1rem;
   }

@@ -110,6 +110,12 @@ export default defineComponent({
     word-break: break-word;
   }

+  &.isNotFullCredit {
+    .code {
+      background-color: mui-color('github', 'partial-authored-code-background');
+    }
+  }
+
   &.untouched {
     $grey: mui-color('grey', '400');
     border-left: .25rem solid $grey;
3 changes: 2 additions & 1 deletion frontend/src/styles/_colors.scss
@@ -303,7 +303,8 @@ $mui-colors: (
   'github': (
     'title-background': #FAFBFC,
     'border': #E1E4E8,
-    'authored-code-background': #E6FFED,
+    'full-authored-code-background': #C8E6C9,
+    'partial-authored-code-background': #E6FFED,
   ),
   'grey': (
     '50': #FAFAFA,
2 changes: 1 addition & 1 deletion frontend/src/styles/hightlight-js-style.css
@@ -34,7 +34,7 @@

 .hljs-section,
 .hljs-name {
-  color: #63a35c;
+  color: #468C5A;
 }

 .hljs-tag {
1 change: 1 addition & 0 deletions frontend/src/types/types.ts
@@ -61,6 +61,7 @@ export interface Repo extends RepoRaw {

 export interface AuthorshipFileSegment {
   knownAuthor: string | null;
+  isFullCredit: boolean;
   lineNumbers: number[];
   lines: string[];
 }
1 change: 1 addition & 0 deletions frontend/src/types/zod/authorship-type.ts
@@ -4,6 +4,7 @@ const lineSchema = z.object({
   lineNumber: z.number(),
   author: z.object({ gitId: z.string() }),
   content: z.string(),
+  isFullCredit: z.boolean(),
 });

 const fileResult = z.object({
6 changes: 5 additions & 1 deletion frontend/src/views/c-authorship.vue
@@ -411,6 +411,7 @@ export default defineComponent({
     splitSegments(lines: Array<Line>): { segments: Array<AuthorshipFileSegment>; blankLineCount: number; } {
       // split into segments separated by knownAuthor
       let lastState: string | null;
+      let lastCreditState: boolean;
       let lastId = -1;
       const segments: Array<AuthorshipFileSegment> = [];
       let blankLineCount = 0;
@@ -420,16 +421,19 @@ export default defineComponent({
           ? !this.isUnknownAuthor(line.author.gitId)
           : line.author.gitId === this.info.author;
         const knownAuthor = (line.author && isAuthorMatched) ? line.author.gitId : null;
+        const isFullCredit = line.isFullCredit;

-        if (knownAuthor !== lastState || lastId === -1) {
+        if (knownAuthor !== lastState || lastId === -1 || (knownAuthor && isFullCredit !== lastCreditState)) {
           segments.push({
             knownAuthor,
+            isFullCredit,
             lineNumbers: [],
             lines: [],
           });

           lastId += 1;
           lastState = knownAuthor;
+          lastCreditState = isFullCredit;
         }

         const content = line.content || ' ';
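The updated condition in `splitSegments` starts a new segment when the known author changes or, for a known author, when the credit state flips. Below is a rough Java rendering of that grouping rule, for readers more at home in the backend code. The `Line` and `Segment` classes are simplified stand-ins rather than the frontend types, and blank-line counting is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class SegmentSplitSketch {
    /** Simplified stand-in for one analysed line; knownAuthor is null for unknown authors. */
    static final class Line {
        final String knownAuthor;
        final boolean isFullCredit;
        final String content;

        Line(String knownAuthor, boolean isFullCredit, String content) {
            this.knownAuthor = knownAuthor;
            this.isFullCredit = isFullCredit;
            this.content = content;
        }
    }

    /** Simplified stand-in for a run of consecutive lines shown with one background colour. */
    static final class Segment {
        final String knownAuthor;
        final boolean isFullCredit;
        final List<String> lines = new ArrayList<>();

        Segment(String knownAuthor, boolean isFullCredit) {
            this.knownAuthor = knownAuthor;
            this.isFullCredit = isFullCredit;
        }
    }

    static List<Segment> splitSegments(List<Line> lines) {
        List<Segment> segments = new ArrayList<>();
        Segment current = null;

        for (Line line : lines) {
            boolean authorChanged = current == null || !Objects.equals(line.knownAuthor, current.knownAuthor);
            // The credit state only splits segments that belong to a known author.
            boolean creditChanged = current != null && line.knownAuthor != null
                    && line.isFullCredit != current.isFullCredit;

            if (authorChanged || creditChanged) {
                current = new Segment(line.knownAuthor, line.isFullCredit);
                segments.add(current);
            }
            current.lines.add(line.content);
        }
        return segments;
    }
}
```

Consecutive lines from an unknown author stay in one segment whatever their credit flag, which mirrors the `knownAuthor &&` guard in the new condition.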
3 changes: 2 additions & 1 deletion src/main/java/reposense/RepoSense.java
@@ -79,7 +79,8 @@ public static void main(String[] args) {
                     cliArguments.getSinceDate(), cliArguments.getUntilDate(),
                     cliArguments.isSinceDateProvided(), cliArguments.isUntilDateProvided(),
                     cliArguments.getNumCloningThreads(), cliArguments.getNumAnalysisThreads(),
-                    TimeUtil::getElapsedTime, cliArguments.getZoneId(), cliArguments.isFreshClonePerformed());
+                    TimeUtil::getElapsedTime, cliArguments.getZoneId(), cliArguments.isFreshClonePerformed(),
+                    cliArguments.isAuthorshipAnalyzed(), cliArguments.getOriginalityThreshold());

             FileUtil.zipFoldersAndFiles(reportFoldersAndFiles, cliArguments.getOutputFilePath().toAbsolutePath(),
                     ".json");
12 changes: 7 additions & 5 deletions src/main/java/reposense/authorship/AuthorshipReporter.java
@@ -11,7 +11,6 @@
 import reposense.model.RepoConfiguration;
 import reposense.system.LogsManager;

-
 /**
  * Generates the authorship summary data for each repository.
  */
@@ -29,24 +28,27 @@ public class AuthorshipReporter {
     private final FileInfoAnalyzer fileInfoAnalyzer = new FileInfoAnalyzer();
     private final FileResultAggregator fileResultAggregator = new FileResultAggregator();

-
     /**
      * Generates and returns the authorship summary for each repo in {@code config}.
+     * Further analyzes the authorship of each line in the commit if {@code shouldAnalyzeAuthorship} is true, based on
+     * {code originalityThreshold}.
      */
-    public AuthorshipSummary generateAuthorshipSummary(RepoConfiguration config) {
+    public AuthorshipSummary generateAuthorshipSummary(RepoConfiguration config, boolean shouldAnalyzeAuthorship,
+            double originalityThreshold) {
         List<FileInfo> textFileInfos = fileInfoExtractor.extractTextFileInfos(config);

         int numFiles = textFileInfos.size();
         int totalNumLines = textFileInfos.stream()
-                .mapToInt(fileInfo -> fileInfo.getNumOfLines())
+                .mapToInt(FileInfo::getNumOfLines)
                 .sum();

         if (totalNumLines > HIGH_NUMBER_LINES_THRESHOLD) {
             logger.warning(String.format(HIGH_NUMBER_LINES_MESSAGE, numFiles, totalNumLines));
         }

         List<FileResult> fileResults = textFileInfos.stream()
-                .map(fileInfo -> fileInfoAnalyzer.analyzeTextFile(config, fileInfo))
+                .map(fileInfo -> fileInfoAnalyzer.analyzeTextFile(config, fileInfo, shouldAnalyzeAuthorship,
+                        originalityThreshold))
                 .filter(Objects::nonNull)
                 .collect(Collectors.toList());

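The Javadoc above says each line is further analyzed "based on" `originalityThreshold`, and the commit list mentions `getLevenshteinDistance` and reducing its space complexity to O(min(s, t)). The diff shown here does not include the scoring code, so the following is only a sketch of how such a check could work. `LevenshteinSketch`, `originalityScore`, and `isFullCredit` are hypothetical names, the normalisation by the longer line's length is an assumption, and the early termination mentioned in the commits is omitted.

```java
class LevenshteinSketch {
    /** Edit distance using two rows sized by the shorter string: O(min(|s|, |t|)) space. */
    static int distance(String s, String t) {
        if (s.length() < t.length()) {
            String swap = s;
            s = t;
            t = swap;
        }

        int[] previous = new int[t.length() + 1];
        int[] current = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= s.length(); i++) {
            current[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int substitution = previous[j - 1] + (s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
                int insertionOrDeletion = Math.min(previous[j], current[j - 1]) + 1;
                current[j] = Math.min(substitution, insertionOrDeletion);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[t.length()];
    }

    /** Hypothetical score in [0, 1]: 0 means identical lines, 1 means nothing in common. */
    static double originalityScore(String currentLine, String previousLine) {
        int longest = Math.max(currentLine.length(), previousLine.length());
        return longest == 0 ? 0.0 : (double) distance(currentLine, previousLine) / longest;
    }

    /** Assumption: a line earns full credit when its score reaches the threshold. */
    static boolean isFullCredit(String currentLine, String previousLine, double originalityThreshold) {
        return originalityScore(currentLine, previousLine) >= originalityThreshold;
    }
}
```

Under these assumptions, changing `int x = 1;` to `int x = 2;` scores 0.1 and would count as partial credit at a threshold of 0.5, while a line rewritten from scratch would score close to 1.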
38 changes: 26 additions & 12 deletions src/main/java/reposense/authorship/FileInfoAnalyzer.java
@@ -12,6 +12,7 @@
 import java.util.logging.Logger;

 import reposense.authorship.analyzer.AnnotatorAnalyzer;
+import reposense.authorship.analyzer.AuthorshipAnalyzer;
 import reposense.authorship.model.FileInfo;
 import reposense.authorship.model.FileResult;
 import reposense.authorship.model.LineInfo;
@@ -44,10 +45,13 @@ public class FileInfoAnalyzer {
     /**
      * Analyzes the lines of the file, given in the {@code fileInfo}, that has changed in the time period provided
      * by {@code config}.
+     * Further analyzes the authorship of each line in the commit if {@code shouldAnalyzeAuthorship} is true, based on
+     * {@code originalityThreshold}.
      * Returns null if the file is missing from the local system, or none of the
      * {@link Author} specified in {@code config} contributed to the file in {@code fileInfo}.
      */
-    public FileResult analyzeTextFile(RepoConfiguration config, FileInfo fileInfo) {
+    public FileResult analyzeTextFile(RepoConfiguration config, FileInfo fileInfo, boolean shouldAnalyzeAuthorship,
+            double originalityThreshold) {
         String relativePath = fileInfo.getPath();

         if (Files.notExists(Paths.get(config.getRepoRoot(), relativePath))) {
@@ -59,10 +63,10 @@ public FileResult analyzeTextFile(RepoConfiguration config, FileInfo fileInfo) {
             return null;
         }

-        aggregateBlameAuthorModifiedAndDateInfo(config, fileInfo);
+        aggregateBlameAuthorModifiedAndDateInfo(config, fileInfo, shouldAnalyzeAuthorship, originalityThreshold);
         fileInfo.setFileType(config.getFileType(fileInfo.getPath()));

-        AnnotatorAnalyzer.aggregateAnnotationAuthorInfo(fileInfo, config.getAuthorConfig());
+        AnnotatorAnalyzer.aggregateAnnotationAuthorInfo(fileInfo, config.getAuthorConfig(), shouldAnalyzeAuthorship);

         if (!config.getAuthorList().isEmpty() && fileInfo.isAllAuthorsIgnored(config.getAuthorList())) {
             return null;
@@ -100,9 +104,8 @@ private FileResult generateTextFileResult(FileInfo fileInfo) {
             authorContributionMap.put(author, authorContributionMap.getOrDefault(author, 0) + 1);
         }

-        return FileResult.createTextFileResult(
-                fileInfo.getPath(), fileInfo.getFileType(), fileInfo.getLines(), authorContributionMap,
-                fileInfo.exceedsFileLimit());
+        return FileResult.createTextFileResult(fileInfo.getPath(), fileInfo.getFileType(), fileInfo.getLines(),
+                authorContributionMap, fileInfo.exceedsFileLimit());
     }

     /**
@@ -139,8 +142,11 @@ private FileResult generateBinaryFileResult(RepoConfiguration config, FileInfo f
      * The {@code config} is used to obtain the root directory for running git blame as well as other parameters used
      * in determining which author to assign to each line and whether to set the last modified date for a
      * {@code lineInfo}.
+     * Further analyzes the authorship of each line in the commit if {@code shouldAnalyzeAuthorship} is true, based on
+     * {@code originalityThreshold}.
      */
-    private void aggregateBlameAuthorModifiedAndDateInfo(RepoConfiguration config, FileInfo fileInfo) {
+    private void aggregateBlameAuthorModifiedAndDateInfo(RepoConfiguration config, FileInfo fileInfo,
+            boolean shouldAnalyzeAuthorship, double originalityThreshold) {
         String blameResults;

         if (!config.isFindingPreviousAuthorsPerformed()) {
@@ -159,14 +165,15 @@ private void aggregateBlameAuthorModifiedAndDateInfo(RepoConfiguration config, F
             String authorName = blameResultLines[lineCount + 1].substring(AUTHOR_NAME_OFFSET);
             String authorEmail = blameResultLines[lineCount + 2]
                     .substring(AUTHOR_EMAIL_OFFSET).replaceAll("<|>", "");
-            Long commitDateInMs = Long.parseLong(blameResultLines[lineCount + 3].substring(AUTHOR_TIME_OFFSET)) * 1000;
+            long commitDateInMs = Long.parseLong(blameResultLines[lineCount + 3].substring(AUTHOR_TIME_OFFSET)) * 1000;
             LocalDateTime commitDate = LocalDateTime.ofInstant(Instant.ofEpochMilli(commitDateInMs),
                     config.getZoneId());
             Author author = config.getAuthor(authorName, authorEmail);

-            if (!fileInfo.isFileLineTracked(lineCount / 5) || author.isIgnoringFile(filePath)
+            int lineNumber = lineCount / 5;
+            if (!fileInfo.isFileLineTracked(lineNumber) || author.isIgnoringFile(filePath)
                     || CommitHash.isInsideCommitList(commitHash, config.getIgnoreCommitList())
-                    || commitDate.compareTo(sinceDate) < 0 || commitDate.compareTo(untilDate) > 0) {
+                    || commitDate.isBefore(sinceDate) || commitDate.isAfter(untilDate)) {
                 author = Author.UNKNOWN_AUTHOR;
             }

@@ -176,9 +183,16 @@ private void aggregateBlameAuthorModifiedAndDateInfo(RepoConfiguration config, F
                             MESSAGE_SHALLOW_CLONING_LAST_MODIFIED_DATE_CONFLICT, config.getRepoName()));
                 }

-                fileInfo.setLineLastModifiedDate(lineCount / 5, commitDate);
+                fileInfo.setLineLastModifiedDate(lineNumber, commitDate);
             }
+            fileInfo.setLineAuthor(lineNumber, author);
+
+            if (shouldAnalyzeAuthorship && !author.equals(Author.UNKNOWN_AUTHOR)) {
+                String lineContent = fileInfo.getLine(lineNumber + 1).getContent();
+                boolean isFullCredit = AuthorshipAnalyzer.analyzeAuthorship(config, fileInfo.getPath(), lineContent,
+                        commitHash, author, originalityThreshold);
+                fileInfo.setIsFullCredit(lineNumber, isFullCredit);
+            }
-            fileInfo.setLineAuthor(lineCount / 5, author);
         }
     }

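The blame-parsing hunk above converts the author time from seconds to milliseconds, builds a `LocalDateTime` in the configured zone, and now uses `isBefore`/`isAfter` in place of `compareTo` to decide whether a line falls outside the analysis window. A small standalone sketch of just that step follows. `BlameWindowSketch` and its method names are illustrative only, and the other conditions in the real check (untracked lines, ignored files, ignored commits) are left out.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

class BlameWindowSketch {
    /** Converts a blame author-time value (seconds since the epoch) into a date in the given zone. */
    static LocalDateTime toCommitDate(long authorTimeInSeconds, ZoneId zoneId) {
        long commitDateInMs = authorTimeInSeconds * 1000;
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(commitDateInMs), zoneId);
    }

    /** True when the commit date lies strictly before sinceDate or strictly after untilDate. */
    static boolean isOutsideWindow(LocalDateTime commitDate, LocalDateTime sinceDate, LocalDateTime untilDate) {
        return commitDate.isBefore(sinceDate) || commitDate.isAfter(untilDate);
    }
}
```

Both bounds are inclusive, matching the earlier `compareTo(...) < 0` and `compareTo(...) > 0` checks: a commit made exactly at `sinceDate` or `untilDate` is still inside the window.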
@@ -48,8 +48,10 @@ public class AnnotatorAnalyzer {
      *
      * @param fileInfo FileInfo to be further analyzed with author annotations.
      * @param authorConfig AuthorConfiguration for current analysis.
+     * @param shouldAnalyzeAuthorship whether credit info needs to be overwritten.
      */
-    public static void aggregateAnnotationAuthorInfo(FileInfo fileInfo, AuthorConfiguration authorConfig) {
+    public static void aggregateAnnotationAuthorInfo(FileInfo fileInfo, AuthorConfiguration authorConfig,
+            boolean shouldAnalyzeAuthorship) {
         Optional<Author> currentAnnotatedAuthor = Optional.empty();
         Path filePath = Paths.get(fileInfo.getPath());
         for (LineInfo lineInfo : fileInfo.getLines()) {
@@ -60,15 +62,15 @@ public static void aggregateAnnotationAuthorInfo(FileInfo fileInfo, AuthorConfig
                 boolean isUnknownAuthorSegment = !currentAnnotatedAuthor.isPresent() && !newAnnotatedAuthor.isPresent();

                 if (isEndOfAnnotatedSegment) {
-                    lineInfo.setAuthor(currentAnnotatedAuthor.get());
+                    lineInfo.updateAuthorAndCredit(currentAnnotatedAuthor.get(), shouldAnalyzeAuthorship);
                     currentAnnotatedAuthor = Optional.empty();
                 } else if (isUnknownAuthorSegment) {
                     currentAnnotatedAuthor = Optional.of(Author.UNKNOWN_AUTHOR);
                 } else {
                     currentAnnotatedAuthor = newAnnotatedAuthor.filter(author -> !author.isIgnoringFile(filePath));
                 }
             }
-            currentAnnotatedAuthor.ifPresent(lineInfo::setAuthor);
+            currentAnnotatedAuthor.ifPresent(author -> lineInfo.updateAuthorAndCredit(author, shouldAnalyzeAuthorship));
         }
     }
