An exploratory research of agile methods - 0.0.1 (working paper)

Eduard BUDACU

Change log

  • 2016-11-21 initial version (articles are imported from the webservice, wordcount and tf-idf computed, documents are retrieved based on similarity, resources are presented)
In [1]:
import graphlab
In [2]:
artdb = graphlab.SFrame.read_json('http://agileresearch-sciencedb.azurewebsites.net/articles')
This non-commercial license of GraphLab Create for academic use is assigned to eduard.budacu@csie.ase.ro and will expire on November 20, 2017.
[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: C:\Users\EDUARD~1.BUD\AppData\Local\Temp\graphlab_server_1479727341.log.0
Downloading http://agileresearch-sciencedb.azurewebsites.net/articles to C:/Users/EDUARD~1.BUD/AppData/Local/Temp/graphlab-eduard.budacu/3668/119f68e8-fdc4-4f60-abb4-083d74fa0c49
Finished parsing file http://agileresearch-sciencedb.azurewebsites.net/articles
Parsing completed. Parsed 1 lines in 0.026038 secs.
In [8]:
len(artdb)
Out[8]:
489

Explore the data set

In [4]:
artdb.head()
Out[4]:
abstract authors id keywords
Abstract Small, self-
directed teams are ...
Yngve Lindsjorn and Dag
I.K. Sjoberg and Torgeir ...
1 Agile development,Project
management,Team ...
Abstract Agile methods in
software development ...
Taghi Javdani Gandomani
and Mina Ziaei Nafchi ...
2 Agile software
development,Agile ...
Abstract The growing
interest in Agile and ...
Indira Nurdiani and
Jurgen Borstler and ...
3 Tertiary study,Agile
software development, ...
AbstractContext Combining
software architecture ...
Chen Yang and Peng Liang
and Paris Avgeriou ...
4 Software
architecture,Agile ...
Abstract The mainstream
research into project ...
Jose Adson O.G. Cunha and
Hermano P. Moura and ...
5 Software Project
Management,Naturalistic ...
Abstract The relationship
between customers and ...
Torgeir Dingsoyr and
Casper Lassenius ...
6 Agile software
development,Software ...
Abstract Context: The
global software industry ...
Vahid Garousi and Kai
Petersen and Baris Ozkan ...
7 Software engineering
,Industry-academia co ...
Abstract The disruptive
nature of the antifra ...
Daniel Russo and Paolo
Ciancarini ...
8 Complex Systems,Software
Engineering,Antifragi ...
AbstractContext Agile
approaches are an ...
C.J. Torrecilla-Salinas
and J. Sedeno and M.J. ...
9 Agile,Scrum,Web
Engineering,CMMI,Soft ...
Abstract Considerable
attention has been paid ...
Ezequiel Scott and
Guillermo Rodriguez and ...
10 Agile software
development,Software ...
title url
Teamwork quality and
project success in ...
http://www.sciencedirect.
com/science/article/p ...
Agile transition and
adoption human-related ...
http://www.sciencedirect.
com/science/article/p ...
The impacts of agile and
lean practices on pro ...
http://www.sciencedirect.
com/science/article/p ...
A systematic mapping
study on the combination ...
http://www.sciencedirect.
com/science/article/p ...
Decision-making in
Software Project ...
http://www.sciencedirect.
com/science/article/p ...
Emerging themes in agile
software development: ...
http://www.sciencedirect.
com/science/article/p ...
Challenges and best
practices in industry- ...
http://www.sciencedirect.
com/science/article/p ...
A Proposal for an
Antifragile Software ...
http://www.sciencedirect.
com/science/article/p ...
Agile, Web Engineering
and Capability Maturity ...
http://www.sciencedirect.
com/science/article/p ...
Towards better Scrum
learning using learning ...
http://www.sciencedirect.
com/science/article/p ...
[10 rows x 6 columns]

Compute the word count for the title field

In [5]:
artdb['word_count'] = graphlab.text_analytics.count_words(artdb['title'])
artdb['word_count']
Out[5]:
dtype: dict
Rows: 489
[{'development': 1L, 'a': 1L, 'and': 1L, 'success': 1L, 'development:': 1L, 'agile': 1L, 'in': 1L, 'teams': 1L, 'project': 1L, 'teamwork': 1L, 'of': 1L, 'survey': 1L, 'quality': 1L, 'software': 1L}, {'and': 2L, 'human-related': 1L, 'theory': 1L, 'grounded': 1L, 'agile': 1L, 'transition': 1L, 'challenges': 1L, 'adoption': 1L, 'issues:': 1L, 'a': 1L, 'approach': 1L}, {'a': 1L, 'and': 1L, 'impacts': 1L, 'on': 1L, 'of': 1L, 'study': 1L, 'lean': 1L, 'project': 1L, 'practices': 1L, 'agile': 1L, 'the': 1L, 'constraints:': 1L, 'tertiary': 1L}, {'development': 1L, 'and': 1L, 'combination': 1L, 'on': 1L, 'agile': 1L, 'study': 1L, 'mapping': 1L, 'a': 1L, 'systematic': 1L, 'of': 1L, 'the': 1L, 'software': 1L, 'architecture': 1L}, {'a': 1L, 'literature': 1L, 'review': 1L, 'decision-making': 1L, 'project': 1L, 'systematic': 1L, 'in': 1L, 'management:': 1L, 'software': 1L}, {'on': 1L, 'emerging': 1L, 'themes': 1L, 'to': 1L, 'development:': 1L, 'agile': 1L, 'section': 1L, 'continuous': 1L, 'value': 1L, 'delivery': 1L, 'introduction': 1L, 'in': 1L, 'the': 1L, 'special': 1L, 'software': 1L}, {'and': 1L, 'a': 1L, 'literature': 1L, 'collaborations': 1L, 'industry-academia': 1L, 'review': 1L, 'challenges': 1L, 'practices': 1L, 'systematic': 1L, 'in': 2L, 'engineering:': 1L, 'best': 1L, 'software': 1L}, {'a': 1L, 'for': 1L, 'an': 1L, 'manifesto': 1L, 'antifragile': 1L, 'proposal': 1L, 'software': 1L}, {'and': 1L, 'web': 1L, 'literature': 1L, 'integration:': 1L, 'review.': 1L, 'agile,': 1L, 'capability': 1L, 'maturity': 1L, 'engineering': 1L, 'a': 1L, 'systematic': 1L, 'model': 1L}, {'styles': 1L, 'towards': 1L, 'scrum': 1L, 'better': 1L, 'learning': 2L, 'using': 1L}, {'challenges': 1L, 'that': 1L, "practitioners'": 1L, 'agile': 1L, 'engaging': 1L, 'concerns': 1L, 'challenge:': 1L, 'the': 1L, 'with': 1L}, {'a': 1L, 'multi-level': 1L, 'management': 1L, 'agile': 1L, 'self-organizing': 1L, 'project': 1L, 'perspective': 1L, 'team': 1L, 'challenges:': 1L}, {'a': 1L, 'and': 1L, 'use': 1L, 'requirements': 2L, 'of': 2L, 'study': 1L, 'as': 1L, 'engineering': 1L, 'multi-case': 1L, 'agile': 1L, 'test': 1L, 'cases': 1L, 'the': 1L}, {'a': 1L, 'delivery': 1L, 'finnish': 1L, 'of': 1L, 'study': 1L, 'enterprises': 1L, 'intensive': 1L, 'cycle:': 1L, 'multiple-case': 1L, 'in': 1L, 'improving': 1L, 'the': 2L, 'toolchains': 1L, 'software': 1L}, {'turkey': 1L, 'versus': 1L, 'practitioner': 1L, 'demographics:': 1L, 'exploratory': 1L, 'of': 1L, 'study': 1L, 'cross-factor': 1L, 'analysis': 1L, 'an': 1L, 'engineering': 1L, 'practices': 1L, 'in': 1L, 'software': 1L}, {'a': 1L, 'on': 1L, 'ubiquitous': 1L, 'for': 1L, 'of': 1L, 'review': 1L, 'engineering': 1L, 'systems': 1L, 'systematic': 1L, 'the': 1L, 'software': 1L}, {'and': 1L, 'development': 1L, 'artefacts': 1L, 'large-scale': 1L, 'agile': 1L, 'tailoring': 1L, 'in': 1L, 'programmes': 1L, 'method': 1L, 'offshore': 1L, 'software': 1L}, {'development': 1L, 'and': 1L, 'agile': 1L, 'study': 1L, 'of': 2L, 'interdependencies': 1L, 'teams': 1L, 'stability': 1L, 'source': 1L, 'as': 1L, 'a': 2L, 'routine': 1L, 'flexibility.': 1L, 'software': 1L}, {'and': 1L, '10': 1L, 'knowledge': 1L, 'of': 1L, 'management:': 1L, 'practice': 1L, 'years': 1L, 'future': 1L, 'architecture': 1L, 'software': 1L}, {'a': 1L, 'on': 1L, 'google': 1L, 'in': 1L, 'secure': 1L, 'agile': 1L, 'industry': 1L, 'current': 1L, 'forms': 1L, 'of': 1L, 'using': 1L, 'the': 1L, 'limitations': 1L, 'study': 1L, 'methods': 1L}, {'lessons': 1L, 'and': 1L, 'product': 1L, 'from': 1L, 'business': 1L, 'model:': 1L, 'managing': 1L, 'agile': 1L, 'transition': 1L, 'to': 1L, 'development': 1L, 'systems': 1L, 'new': 1L, 'cisco': 1L, 'the': 2L}, {'a': 1L, 'meeting:': 1L, 'theory': 1L, 'grounded': 1L, 'study': 1L, 'daily': 1L, 'the': 1L, 'stand-up': 1L}, {'a': 1L, 'and': 2L, 'direction': 1L, 'business': 1L, 'science': 1L, 'review': 1L, 'agile,': 1L, 'analytics': 1L, 'future': 1L, 'intelligence,': 1L, 'of': 1L, 'data': 1L}, {'development': 1L, 'a': 1L, 'agile': 1L, 'communication': 1L, 'review': 1L, 'distributed': 1L, 'empirical': 1L, 'geographically': 1L, 'systematic': 1L, 'of': 1L, 'studies': 1L, 'challenges:': 1L}, {'a': 1L, 'management': 1L, 'government:': 1L, 'agile': 1L, 'innovation': 1L, 'research': 1L, 'agenda': 1L, 'in': 1L}, {'and': 1L, 'a': 1L, 'literature': 1L, 'success': 1L, 'for': 1L, 'factors': 1L, 'agile': 1L, 'review': 1L, 'large-scale': 1L, 'challenges': 1L, 'transformations:': 1L, 'systematic': 1L}, {'a': 1L, 'measuring': 1L, 'large-scale': 1L, 'agile': 1L, 'quantitatively': 1L, 'transformation': 1L}, {'industrial': 1L, 'methods': 1L, 'embedded': 1L, 'agile': 1L, 'study': 1L, 'development:': 1L, 'system': 1L, 'three': 1L, 'multiple-case': 1L, 'of': 1L, 'in': 1L, 'cases': 1L}, {'for': 1L, 'working': 1L, 'space': 1L, 'agile': 1L, 'team': 1L, '(scrum)': 1L, 'project': 1L, 'of': 1L, 'conceptual': 1L, 'model': 1L}, {'a': 1L, 'web-based': 1L, 'and': 1L, 'for': 1L, 'scenario': 1L, 'urban': 1L, 'system': 1L, 'modelling': 1L, 'visualisation': 1L, 'precinct': 1L, 'assessment': 1L, '3d': 1L}, {'and': 1L, 'development': 1L, 'product': 1L, 'turning': 1L, 'world:': 1L, 'agile': 1L, 'into': 1L, 'weaknesses': 1L, 'traditional': 1L, 'hyperconnected': 1L, 'a': 1L, 'in': 1L, 'strengths': 1L}, {'development': 1L, 'exploratory': 1L, 'agile': 1L, 'study': 1L, 'global': 1L, 'communication': 1L, 'an': 1L, 'in': 2L, 'software': 1L}, {'on': 1L, 'agility': 1L, 'management': 1L, 'theory': 1L, 'construct': 1L, 'project': 1L, 'the': 1L}, {'operations': 1L, 'a': 1L, 'for': 1L, 'algorithm': 1L, 'agile': 1L, 'new': 1L, 'satellite-based': 1L, 'acquisition': 1L}, {}, {'management': 1L, 'identifying': 1L, 'of': 1L, 'profession': 1L, 'project': 1L, 'state': 1L, 'the': 2L}, {'information': 1L, 'processing': 1L, 'casting': 1L, 'for': 1L, 'pressure': 1L, 'agile': 1L, 'modern': 1L, 'an': 1L, 'high': 1L, 'applications': 1L, 'systems': 1L, 'framework': 1L, 'in': 1L, 'die': 1L, 'manufacturing': 1L}, {'turn-around': 1L, 'agile': 1L, '\{uav\}': 1L, 'an': 1L, 'fixed-wing': 1L, 'manoeuvres': 1L, 'aggressive': 1L, 'with': 1L}, {'and': 1L, 'in': 1L, 'for': 1L, 'facilities': 1L, 'ramp-up': 1L, 'time': 1L, 'reduce': 1L, 'an': 1L, 'to': 1L, 'production': 2L, 'automated': 1L, 'commissioning': 1L, 'approach': 1L, 'multi-variant': 1L}, {'and': 2L, 'the': 2L, 'to': 1L, 'of': 2L, 'workers': 1L, 'ebola': 1L, 'experience': 1L, 'public': 1L, 'analytics': 1L, 'health': 2L, 'systems': 1L, 'during': 1L, 'empower': 1L, 'using': 1L, 'technology': 1L, 'simulation': 1L, 'integrity': 1L, 'crisis': 1L, 'frontline': 1L, 'improve': 1L}, {'radios': 1L, 'for': 1L, '\{ofdm\}': 1L, 'mask': 1L, 'emission': 1L, 'spectrally': 1L, 'efficient': 1L, 'shaping': 1L, 'cognitive': 1L}, {'development': 1L, 'product': 1L, 'iterative': 1L, 'highly': 1L, 'prototyping': 1L, 'target-oriented': 1L, 'in': 1L}, {'a': 1L, 'and': 2L, 'coupling': 1L, 'for': 1L, 'models': 1L, 'with': 1L, 'diffusion': 1L, 'local': 1L, 'nonlocal': 1L, 'strategy': 1L, 'volume': 1L, 'mixed': 1L, 'boundary': 1L, 'conditions': 1L, 'constraints': 1L}, {'exploring': 1L, 'anti-patterns': 1L, 'study': 1L, 'scrum': 1L, 'empirical': 1L, 'scrumbut--an': 1L, 'of': 1L}, {'and': 1L, 'a': 1L, 'paradigms': 1L, 'business': 1L, 'of': 1L, 'perspective:': 1L, 'integration': 1L, 'agile,': 1L, 'lean,': 1L, 'green': 1L, 'in': 1L, 'resilient': 1L, 'foundations': 1L, 'model': 1L, 'theoretical': 1L}, {'a': 1L, 'development': 1L, 'and': 1L, 'external': 1L, 'on': 1L, 'of': 1L, 'review': 1L, 'productivity:': 1L, 'driven': 1L, 'internal': 1L, 'effects': 1L, 'systematic': 1L, 'test': 1L, 'the': 1L, 'quality': 1L, 'quality,': 1L}, {'and': 1L, 'use': 1L, 'management': 2L, 'agile': 1L, 'big': 1L, 'data': 1L, 'project': 1L, 'in': 1L, 'approach': 1L, 'its': 1L}, {'and': 1L, 'the': 2L, 'battery': 1L, 'an': 1L, 'as': 1L, 'for': 1L, 'lean': 1L, 'fast': 1L, 'to': 1L, 'development': 1L, 'product': 1L, 'return': 1L, 'e-mobility': 1L, 'using': 1L, 'with': 1L, 'optimize': 1L, 'a': 1L, 'of': 2L, 'engineering': 1L, 'enabler': 1L, 'example': 1L, 'lithium-ion': 1L}, {'and': 1L, 'plant': 1L, 'towards': 1L, 'agile': 1L, 'in': 1L, 'mechatronic': 1L, 'engineering': 1L, 'construction': 1L, 'machinery': 1L, 'of': 1L, 'systems': 1L}, {'a': 1L, 'on': 1L, 'product': 1L, 'management': 1L, 'in': 1L, 'conferences': 1L, 'for': 1L, '\{ifac\}': 1L, 'agile': 1L, 'papers': 1L, 'transformation': 1L, 'preparation': 1L, 'framework': 1L, 'symposia:': 1L, 'rapid': 1L, 'of': 2L, 'based': 1L, '&': 1L, 'data': 2L, 'lifecycle': 1L, 'view': 1L}, {'a': 1L, 'and': 2L, 'continuous': 1L, 'business': 1L, 'for': 1L, 'of': 1L, 'combining': 1L, 'modelling': 1L, 'it:': 1L, 'architecture': 1L, 'enterprise': 2L, 'new': 1L, 'the': 1L, 'ontology': 1L, 'alignment': 1L, 'paradigm': 1L}, {'and': 2L, 'systems:': 1L, 'business': 2L, 'for': 1L, 'of': 1L, 'influence': 1L, 'experience': 1L, 'objects': 1L, 'decomposition': 1L, 'components': 1L, 'characteristics': 1L, 'modeling': 1L, 'the': 1L, 'analyst': 1L, 'reusing': 1L}, {'a': 1L, 'on': 1L, 'benefits': 1L, 'success': 1L, 'characteristics': 1L, 'in': 1L, 'delivering': 1L, 'client': 1L, 'survey': 1L, 'of': 1L, 'the': 1L, 'with': 1L, 'projects': 1L}, {'profile': 1L, 'and': 1L, 'scrum': 1L, 'in': 1L, 'process': 1L, 'andromda': 1L, 'testing': 2L, 'driven': 1L, '\{uml2\}': 1L, 'automated': 1L, 'using': 1L, 'model': 1L}, {'among': 1L, 'business': 1L, 'process': 1L, 'better': 1L, 'to': 1L, 'models': 1L, 'stories': 1L, 'user': 1L, 'dependencies': 1L, 'using': 1L, 'the': 1L, 'understand': 1L}, {'information': 1L, 'agility': 1L, '(plis)': 1L, 'for': 1L, 'of': 1L, 'measures': 1L, 'quantified': 1L, 'production': 1L, 'towards': 1L, 'systems': 1L, 'line': 1L}, {'and': 2L, 'network': 1L, 'evaluation': 1L, 'principles': 1L, 'data': 1L, 'design': 1L, 'orchestration': 1L, 'in': 1L, 'performance': 1L, '\{sdn\}': 1L, 'centers:': 1L, 'cloud': 1L}, {'from': 1L, 'knowledge': 1L, 'that': 1L, 'approach': 1L, 'interactions': 1L, 'an': 1L, 'extraction': 1L, 'design': 1L, 'ktr:': 1L, 'supports': 1L}, {'a': 1L, 'and': 1L, 'towards': 1L, 'government': 1L, 'stable,': 1L, 'accountable': 1L, 'responsive': 1L, 'adaptive': 1L, 'governance:': 1L}, {'enhance': 1L, 'effective': 1L, 'tio2': 1L, 'photocatalytic': 1L, 'of': 2L, 'prepared': 1L, 'modified': 1L, 'by': 1L, 'f-doped': 1L, 'to': 1L, 'sol-gel': 1L, 'trifluoroacetic': 1L, '(tfa)': 1L, 'activity': 1L, 'role': 1L, 'the': 1L, 'acid': 1L, 'method': 1L}, {'development': 1L, 'supporting': 1L, 'product': 1L, 'networked': 1L, 'network': 1L, 'for': 1L, 'dynamic': 1L, 'system': 1L, 'factory': 1L, 'digital': 1L, 'collaborative': 1L, 'manufacturing': 1L}, {'and': 1L, 'on': 1L, 'agility': 1L, 'perceptions': 1L, 'creating': 1L, 'service': 2L, 'organizations': 1L, 'of': 1L, '\{it\}': 3L, 'influence': 1L, 'internal': 1L, 'it:': 1L, 'agile': 1L, 'through': 1L, 'the': 1L, 'quality': 1L}, {'and': 1L, 'conflicting': 1L, 'complementary': 1L, 'controls': 1L, 'information': 1L, 'control': 1L, 'development': 1L, 'alignment:': 1L, 'systems': 2L}, {'a': 1L, 'and': 1L, 'decentralized': 1L, 'for': 1L, 'of': 1L, 'extensible': 1L, 'modular': 1L, 'production': 1L, 'design': 1L, 'systems': 1L, 'flexible': 1L, 'architecture': 1L}, {'case': 1L, 'and': 1L, 'hotel': 1L, 'monitoring': 1L, 'of': 2L, 'study': 1L, "destination's": 1L, 'bilbao': 1L, 'performance': 1L, 'industry:': 1L, 'a': 1L, 'in': 1L, '2014': 1L, 'the': 2L, 'benchmarking': 1L}, {'and': 1L, 'functionalities': 1L, 'multi-cloud': 1L, 'approaches': 1L, 'platform-as-a-service': 1L, 'model,': 1L}, {'a': 1L, 'and': 1L, 'industrial': 1L, 'for': 1L, 'approach': 1L, 'designing': 1L, 'programming': 1L, 'arms': 1L, 'reconfigurable': 1L, 'assembling': 1L, 'robotic': 1L, 'stochastic': 1L, 'robots:': 1L}, {'time-driven': 1L, 'academic': 1L, 'libraries': 1L, 'to': 1L, 'practices': 1L, 'identify': 1L, 'costing': 1L, 'in': 1L, 'using': 1L, 'best': 1L, 'activity-based': 1L}, {'a': 1L, 'secured': 1L, 'towards': 1L, 'virtualization': 1L, 'network': 1L}, {'and': 1L, 'shop': 1L, 'implementation': 1L, 'of': 1L, 'optimization': 1L, 'system': 1L, 'melt': 1L, 'an': 1L, 'electricity': 1L, 'production': 1L, 'in': 1L, 'integrated': 1L}, {'code': 1L, 'exploratory': 1L, 'for': 1L, 'of': 1L, 'study': 1L, 'an': 1L, 'refactoring': 1L, 'elements': 1L, 'coverage': 1L, 'impacted': 1L, 'test': 1L, 'detecting': 1L, 'faults:': 1L}, {'leave': 1L, 'climate': 1L, 'others?': 1L, 'some': 1L, 'bat': 1L, 'than': 1L, 'will': 1L, 'species': 1L, 'thirstier': 1L, 'desert': 1L, 'change': 1L}, {'and': 1L, 'high': 1L, 'requirement': 1L, 'agile': 1L, 'for': 1L, 'control': 1L, 'satellite': 1L, 'moment': 1L, 'analysis': 1L, 'stability': 1L, 'attitude': 1L, 'design': 1L, 'a': 1L, 'of': 1L, 'with': 1L, 'unit': 1L}, {'and': 1L, 'obese': 1L, 'valve': 1L, 'aortic': 1L, 'prostheses': 1L, 'generation': 1L, 'in': 1L, 'overcoming': 1L, 'new': 1L, 'transcatheter': 1L, 'challenges': 1L, 'implantation': 1L, 'hemodynamic': 1L, 'performance': 1L, 'technical': 1L, 'using': 1L, 'adequate': 1L, 'patients:': 1L, 'maintaining': 1L}, {'mobility': 1L, 'mobile': 1L, 'virtualized': 1L, 'prediction': 1L, '\{lte\}': 1L, 'edge': 1L, 'in': 1L, 'caching': 1L, 'with': 1L, 'networks': 1L}, {'spherical': 1L, 'a': 1L, 'and': 1L, 'novel': 1L, 'of': 1L, '\{uav\}': 1L, 'development': 1L, 'design': 1L}, {'and': 1L, 'measuring': 1L, 'personal': 1L, 'it': 1L, 'value': 1L, 'oncology:': 1L, 'challenges': 1L, 'cost': 1L, 'in': 2L, 'making': 1L}, {'management': 1L, 'help': 1L, 'coping': 1L, 'complexity?': 1L, 'lean': 1L, 'project': 2L, 'does': 1L, 'agile': 1L, '&': 1L, 'with': 1L}, {'mining': 1L, 'agile': 1L, 'for': 1L, 'graph': 1L, 'analysis': 1L, 'detection': 1L, '\{dns\}': 1L, 'traffic': 1L, 'using': 1L, 'cybercrime': 1L}, {'and': 1L, 'stage-gate': 1L, 'management': 1L, 'for': 1L, 'agile': 1L, 'hybrid': 1L, 'companies': 1L, 'project': 1L, 'framework': 1L, 'technology-based': 1L, 'model--a': 1L}, {'industries': 1L, 'initiatives': 1L, 'management': 1L, 'for': 1L, 'of': 1L, 'in': 1L, 'system': 1L, 'strategic': 1L, 'innovation': 1L, 'the': 1L, 'manufacturing': 1L, 'evaluation': 1L, 'integrated': 1L}, {'elastic': 1L, 'for': 1L, 'survivability': 1L, 'optical': 1L, 'software-defined': 1L, 'adaptive': 1L, 'networks': 1L}, {'case': 1L, 'a': 1L, 'storage': 1L, 'in': 1L, 'for': 1L, 'energy': 1L, 'technologies': 1L, 'automotive': 1L, 'industry': 1L, 'of': 1L, '-': 1L, 'predevelopment': 1L, 'production': 1L, 'systems': 1L, 'agile': 1L, 'the': 1L, 'study': 1L, 'electric': 1L}, {'tire': 1L, 'for': 1L, 'mobility': 1L, 'of': 1L, 'slippage': 1L, 'radical': 1L, 'enhancement': 1L, 'vehicle': 1L, 'dynamics': 1L, 'agile': 1L}, {'for': 1L, 'continuous': 1L, '\{right\}': 1L, 'the': 1L, 'model': 1L, 'experimentation': 1L}, {'building': 1L, 'a': 1L, 'module:': 1L, 'for': 1L, 'adapter': 1L, 'production': 1L, 'self-learning': 1L, 'systems': 1L, 'the': 1L, 'block': 1L}, {'available': 1L, 'among': 1L, 'probed': 1L, 'specificities': 1L, 'fluorescence-based': 1L, 'kits': 1L, 'multiplex': 1L, 'typing': 1L, '\{pcr\}': 1L, 'primates': 1L, 'with': 1L, 'species': 1L, 'commercially': 1L}, {'facilitation': 1L, 'recovery': 1L, 'for': 1L, 'of': 1L, 'assisted': 1L, 'potatorum:': 1L, 'an': 1L, 'ecological': 1L, 'agave': 1L, 'approach': 1L, 'population': 1L}, {'spectrum-based': 1L, 'ad': 1L, 'for': 1L, 'coordination': 1L, 'wireless': 1L, 'spread': 1L, 'design': 1L, 'spectrum-agile': 1L, 'networks': 1L, 'hoc': 1L}, {'a': 1L, 'development': 1L, 'and': 1L, 'web': 1L, 'managing': 1L, 'agile': 1L, 'planning': 1L, 'perspective': 1L, 'estimating,': 1L, 'under': 1L, 'projects': 1L, 'value-based': 1L}, {'on': 1L, 'use': 1L, 'agile': 1L, 'management:': 1L, 'practice': 1L, 'empirical': 1L, 'an': 1L, 'perspective': 1L, 'in': 1L, 'portfolio': 1L, 'the': 1L}, {'operations': 1L, 'chipper': 1L, 'for': 1L, 'agile': 1L, 'an': 1L, 'space-constrained': 1L, 'truck': 1L}, {'and': 1L, 'a': 1L, 'based': 1L, 'antenna': 1L, 'on': 1L, 'miniaturized': 1L, 'reconfigurable': 1L, 'film': 1L, 'thin': 1L, '\{bst\}': 1L, 'notch': 1L, 'ferroelectric': 1L}, {'and': 1L, '20-21': 1L, 'may': 1L, 'continuous': 2L, 'companies': 1L, 'challenges': 1L, 'equipment': 1L, '2014': 1L, 'analytical': 1L, 'symposium': 1L, 'manufacturing': 1L, 'meeting': 1L}, {'case': 1L, 'the': 1L, 'for': 1L, 'set-based': 1L, 'of': 1L, 'a': 1L, 'memory': 1L, 'rough': 1L, 'ecotourism': 1L, 'corporate': 1L}, {'and': 2L, 'into': 1L, '(hsi)': 1L, 'integration': 1L, 'engineering': 1L, 'at': 1L, 'human': 1L, 'airbus': 1L, 'defence': 1L, "characteristics'": 1L, 'space': 1L, 'implementation': 1L, '-': 1L, 'system': 1L, 'systems': 1L, 'approach': 1L, "'non-functional": 1L, 'lifecycle': 1L, 'a': 1L, 'of': 1L, 'practical': 1L, 'the': 1L}, {'and': 2L, 'among': 1L, '\{jaus\}': 1L, 'communication': 1L, 'collaboration': 1L, 'standard': 1L, 'unmanned': 1L, 'systems': 1L, 'formats': 1L, 'heterogeneous': 1L, 'using': 1L, '\{sae\}': 1L, 'protocols': 1L}, {'case': 1L, 'business': 1L, 'blisstrail:': 1L, 'agile': 1L, 'study': 1L, 'an': 1L, 'project': 1L}, {'breakdown': 1L, 'interaction': 1L, 'room': 1L, 'agile': 1L, 'comprehensiveness': 1L, 'an': 1L, 'task': 1L, 'in': 1L, 'improving': 1L, 'with': 1L, 'projects': 1L}, {'categorization': 1L, 'risk': 1L, 'for': 1L, 'factors': 1L, 'agile': 1L, 'distributed': 1L, 'of': 1L, 'projects': 1L}, ... ]
In [9]:
artdb.head()
Out[9]:
abstract authors id keywords
Abstract Small, self-
directed teams are ...
Yngve Lindsjorn and Dag
I.K. Sjoberg and Torgeir ...
1 Agile development,Project
management,Team ...
Abstract Agile methods in
software development ...
Taghi Javdani Gandomani
and Mina Ziaei Nafchi ...
2 Agile software
development,Agile ...
Abstract The growing
interest in Agile and ...
Indira Nurdiani and
Jurgen Borstler and ...
3 Tertiary study,Agile
software development, ...
AbstractContext Combining
software architecture ...
Chen Yang and Peng Liang
and Paris Avgeriou ...
4 Software
architecture,Agile ...
Abstract The mainstream
research into project ...
Jose Adson O.G. Cunha and
Hermano P. Moura and ...
5 Software Project
Management,Naturalistic ...
Abstract The relationship
between customers and ...
Torgeir Dingsoyr and
Casper Lassenius ...
6 Agile software
development,Software ...
Abstract Context: The
global software industry ...
Vahid Garousi and Kai
Petersen and Baris Ozkan ...
7 Software engineering
,Industry-academia co ...
Abstract The disruptive
nature of the antifra ...
Daniel Russo and Paolo
Ciancarini ...
8 Complex Systems,Software
Engineering,Antifragi ...
AbstractContext Agile
approaches are an ...
C.J. Torrecilla-Salinas
and J. Sedeno and M.J. ...
9 Agile,Scrum,Web
Engineering,CMMI,Soft ...
Abstract Considerable
attention has been paid ...
Ezequiel Scott and
Guillermo Rodriguez and ...
10 Agile software
development,Software ...
title url word_count
Teamwork quality and
project success in ...
http://www.sciencedirect.
com/science/article/p ...
{'development': 1L, 'a':
1L, 'and': 1L, 'succe ...
Agile transition and
adoption human-related ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 2L, 'human-
related': 1L, 'theory': ...
The impacts of agile and
lean practices on pro ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'and': 1L,
'impacts': 1L, 'on': 1L, ...
A systematic mapping
study on the combination ...
http://www.sciencedirect.
com/science/article/p ...
{'development': 1L,
'and': 1L, 'combinati ...
Decision-making in
Software Project ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'literature':
1L, 'review': 1L, ...
Emerging themes in agile
software development: ...
http://www.sciencedirect.
com/science/article/p ...
{'on': 1L, 'emerging':
1L, 'themes': 1L, 'to': ...
Challenges and best
practices in industry- ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 1L, 'a': 1L,
'literature': 1L, ...
A Proposal for an
Antifragile Software ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'for': 1L,
'an': 1L, 'manifesto': ...
Agile, Web Engineering
and Capability Maturity ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 1L, 'web': 1L,
'literature': 1L, ...
Towards better Scrum
learning using learning ...
http://www.sciencedirect.
com/science/article/p ...
{'styles': 1L, 'towards':
1L, 'scrum': 1L, ...
[10 rows x 7 columns]

Compute the the tf_idf

In [10]:
artdb['tf_idf'] = graphlab.text_analytics.tf_idf(artdb['word_count'])
In [11]:
artdb.head()
Out[11]:
abstract authors id keywords
Abstract Small, self-
directed teams are ...
Yngve Lindsjorn and Dag
I.K. Sjoberg and Torgeir ...
1 Agile development,Project
management,Team ...
Abstract Agile methods in
software development ...
Taghi Javdani Gandomani
and Mina Ziaei Nafchi ...
2 Agile software
development,Agile ...
Abstract The growing
interest in Agile and ...
Indira Nurdiani and
Jurgen Borstler and ...
3 Tertiary study,Agile
software development, ...
AbstractContext Combining
software architecture ...
Chen Yang and Peng Liang
and Paris Avgeriou ...
4 Software
architecture,Agile ...
Abstract The mainstream
research into project ...
Jose Adson O.G. Cunha and
Hermano P. Moura and ...
5 Software Project
Management,Naturalistic ...
Abstract The relationship
between customers and ...
Torgeir Dingsoyr and
Casper Lassenius ...
6 Agile software
development,Software ...
Abstract Context: The
global software industry ...
Vahid Garousi and Kai
Petersen and Baris Ozkan ...
7 Software engineering
,Industry-academia co ...
Abstract The disruptive
nature of the antifra ...
Daniel Russo and Paolo
Ciancarini ...
8 Complex Systems,Software
Engineering,Antifragi ...
AbstractContext Agile
approaches are an ...
C.J. Torrecilla-Salinas
and J. Sedeno and M.J. ...
9 Agile,Scrum,Web
Engineering,CMMI,Soft ...
Abstract Considerable
attention has been paid ...
Ezequiel Scott and
Guillermo Rodriguez and ...
10 Agile software
development,Software ...
title url word_count tf_idf
Teamwork quality and
project success in ...
http://www.sciencedirect.
com/science/article/p ...
{'development': 1L, 'a':
1L, 'and': 1L, 'succe ...
{'development':
1.943867247425513, 'a': ...
Agile transition and
adoption human-related ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 2L, 'human-
related': 1L, 'theory': ...
{'and':
2.1131281048492205, ...
The impacts of agile and
lean practices on pro ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'and': 1L,
'impacts': 1L, 'on': 1L, ...
{'a': 1.0683985100716131,
'and': ...
A systematic mapping
study on the combination ...
http://www.sciencedirect.
com/science/article/p ...
{'development': 1L,
'and': 1L, 'combinati ...
{'development':
1.943867247425513, 'a ...
Decision-making in
Software Project ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'literature':
1L, 'review': 1L, ...
{'a': 1.0683985100716131,
'literature': ...
Emerging themes in agile
software development: ...
http://www.sciencedirect.
com/science/article/p ...
{'on': 1L, 'emerging':
1L, 'themes': 1L, 'to': ...
{'on':
2.2220705759227504, ...
Challenges and best
practices in industry- ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 1L, 'a': 1L,
'literature': 1L, ...
{'and':
1.0565640524246103, 'a': ...
A Proposal for an
Antifragile Software ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'for': 1L,
'an': 1L, 'manifesto': ...
{'a': 1.0683985100716131,
'for': ...
Agile, Web Engineering
and Capability Maturity ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 1L, 'web': 1L,
'literature': 1L, ...
{'and':
1.0565640524246103, ...
Towards better Scrum
learning using learning ...
http://www.sciencedirect.
com/science/article/p ...
{'styles': 1L, 'towards':
1L, 'scrum': 1L, ...
{'styles':
5.499215308914927, ...
[10 rows x 8 columns]

K nearest neighbors

In [13]:
knn_model = graphlab.nearest_neighbors.create(artdb,features=['tf_idf'],label='title')
Starting brute force nearest neighbors model training.

Query similar articles

In [18]:
knn_model.query(artdb[artdb['id'] == 4])
Starting pairwise querying.
+--------------+---------+-------------+--------------+
| Query points | # Pairs | % Complete. | Elapsed Time |
+--------------+---------+-------------+--------------+
| 0            | 1       | 0.204499    | 0us          |
| Done         |         | 100         | 1.001ms      |
+--------------+---------+-------------+--------------+
Out[18]:
query_label reference_label distance rank
0 A systematic mapping
study on the combination ...
0.0 1
0 The impacts of agile and
lean practices on pro ...
0.631578947368 2
0 Routine interdependencies
as a source of stability ...
0.65 3
0 A systematic review on
the engineering of ...
0.666666666667 4
0 The impact of inadequate
and dysfunctional ...
0.666666666667 5
[5 rows x 4 columns]
In [16]:
artdb[['tf_idf']].stack('tf_idf',new_column_name=['word','tf_idf']).sort('tf_idf',ascending=False)
Out[16]:
word tf_idf
de 16.4976459267
\{it\} 13.7487737311
usability 12.3847249789
estate 12.3847249789
medical 12.3847249789
landscape 12.3847249789
de 10.9984306178
hospital 10.9984306178
real 10.9984306178
secure 10.9984306178
[5182 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [17]:
artdb[['word_count']].stack('word_count',new_column_name=['word','word_count']).sort('word_count',ascending=False)
Out[17]:
word word_count
and 3
of 3
de 3
the 3
and 3
a 3
\{it\} 3
and 3
the 3
in 3
[5182 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

Compute word cound and tf_idf on abstract

In [19]:
artdb['word_count_abstract'] = graphlab.text_analytics.count_words(artdb['abstract'])
artdb['tf_idf_abstract'] = graphlab.text_analytics.count_words(artdb['word_count_abstract'])
In [20]:
artdb.head()
Out[20]:
abstract authors id keywords
Abstract Small, self-
directed teams are ...
Yngve Lindsjorn and Dag
I.K. Sjoberg and Torgeir ...
1 Agile development,Project
management,Team ...
Abstract Agile methods in
software development ...
Taghi Javdani Gandomani
and Mina Ziaei Nafchi ...
2 Agile software
development,Agile ...
Abstract The growing
interest in Agile and ...
Indira Nurdiani and
Jurgen Borstler and ...
3 Tertiary study,Agile
software development, ...
AbstractContext Combining
software architecture ...
Chen Yang and Peng Liang
and Paris Avgeriou ...
4 Software
architecture,Agile ...
Abstract The mainstream
research into project ...
Jose Adson O.G. Cunha and
Hermano P. Moura and ...
5 Software Project
Management,Naturalistic ...
Abstract The relationship
between customers and ...
Torgeir Dingsoyr and
Casper Lassenius ...
6 Agile software
development,Software ...
Abstract Context: The
global software industry ...
Vahid Garousi and Kai
Petersen and Baris Ozkan ...
7 Software engineering
,Industry-academia co ...
Abstract The disruptive
nature of the antifra ...
Daniel Russo and Paolo
Ciancarini ...
8 Complex Systems,Software
Engineering,Antifragi ...
AbstractContext Agile
approaches are an ...
C.J. Torrecilla-Salinas
and J. Sedeno and M.J. ...
9 Agile,Scrum,Web
Engineering,CMMI,Soft ...
Abstract Considerable
attention has been paid ...
Ezequiel Scott and
Guillermo Rodriguez and ...
10 Agile software
development,Software ...
title url word_count tf_idf
Teamwork quality and
project success in ...
http://www.sciencedirect.
com/science/article/p ...
{'development': 1L, 'a':
1L, 'and': 1L, 'succe ...
{'development':
1.943867247425513, 'a': ...
Agile transition and
adoption human-related ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 2L, 'human-
related': 1L, 'theory': ...
{'and':
2.1131281048492205, ...
The impacts of agile and
lean practices on pro ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'and': 1L,
'impacts': 1L, 'on': 1L, ...
{'a': 1.0683985100716131,
'and': ...
A systematic mapping
study on the combination ...
http://www.sciencedirect.
com/science/article/p ...
{'development': 1L,
'and': 1L, 'combinati ...
{'development':
1.943867247425513, 'a ...
Decision-making in
Software Project ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'literature':
1L, 'review': 1L, ...
{'a': 1.0683985100716131,
'literature': ...
Emerging themes in agile
software development: ...
http://www.sciencedirect.
com/science/article/p ...
{'on': 1L, 'emerging':
1L, 'themes': 1L, 'to': ...
{'on':
2.2220705759227504, ...
Challenges and best
practices in industry- ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 1L, 'a': 1L,
'literature': 1L, ...
{'and':
1.0565640524246103, 'a': ...
A Proposal for an
Antifragile Software ...
http://www.sciencedirect.
com/science/article/p ...
{'a': 1L, 'for': 1L,
'an': 1L, 'manifesto': ...
{'a': 1.0683985100716131,
'for': ...
Agile, Web Engineering
and Capability Maturity ...
http://www.sciencedirect.
com/science/article/p ...
{'and': 1L, 'web': 1L,
'literature': 1L, ...
{'and':
1.0565640524246103, ...
Towards better Scrum
learning using learning ...
http://www.sciencedirect.
com/science/article/p ...
{'styles': 1L, 'towards':
1L, 'scrum': 1L, ...
{'styles':
5.499215308914927, ...
word_count_abstract tf_idf_abstract
{'and': 5L, 'strongly':
1L, 'development.': 1L, ...
{'strongly': 1L, 'and':
1L, 'modeling.': 1L, ...
{'development.': 1L,
'help': 2L, 'show': 1L, ...
{'development.': 1L,
'help': 1L, 'less': 1L, ...
{'schedule,': 1L,
'consolidated': 1L, ...
{'schedule,': 1L,
'consolidated': 1L, ...
{'help': 1L, 'lack': 1L,
'results': 2L, 'years': ...
{'help': 1L, 'lack': 1L,
'results': 1L, 'years': ...
{'phenomenon': 2L,
'results': 1L, 'years': ...
{'phenomenon': 1L,
'results': 1L, 'years': ...
{'and': 3L, 'emerging':
1L, 'development.': 1L, ...
{'and': 1L, 'emerging':
1L, 'development.': 1L, ...
{'limited': 1L, 'do)':
1L, 'show': 1L, 'being': ...
{'limited': 1L, 'show':
1L, 'being': 1L, ...
{'all': 1L,
'implementation': 1L, ...
{'all': 1L,
'implementation': 1L, ...
{'responsible': 1L,
'particularly': 1L, ...
{'particularly': 1L,
'help': 1L, 'lack': 1L, ...
{'results': 1L, 'years':
1L, 'professors': 1L, ...
{'both': 1L, 'results':
1L, 'years': 1L, ...
[10 rows x 10 columns]
In [21]:
knn_model_abstract = graphlab.nearest_neighbors.create(artdb,features=['tf_idf_abstract'],label='title')
Starting brute force nearest neighbors model training.
In [22]:
knn_model_abstract.query(artdb[artdb['id'] == 1])
Starting pairwise querying.
+--------------+---------+-------------+--------------+
| Query points | # Pairs | % Complete. | Elapsed Time |
+--------------+---------+-------------+--------------+
| 0            | 1       | 0.204499    | 0us          |
| Done         |         | 100         | 1.001ms      |
+--------------+---------+-------------+--------------+
Out[22]:
query_label reference_label distance rank
0 Teamwork quality and
project success in ...
0.0 1
0 How do personality, team
processes and task ...
0.8125 2
0 Moving from Traditional
to Agile Software ...
0.832335329341 3
0 The impact of inadequate
and dysfunctional ...
0.833333333333 4
0 The impact of inadequate
customer collaboratio ...
0.838709677419 5
[5 rows x 4 columns]
In [28]:
artdb[artdb['id'] == 4][['tf_idf_abstract']].stack('tf_idf_abstract', new_column_name=['word', 'tf_idf']).sort('tf_idf',ascending=False)
Out[28]:
word tf_idf
approaches, 1
on 1
practices 1
was 1
comprehensive 1
agile 1
software 1
combining 1
received 1
state 1
[117 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

Resources

In [ ]: