Sunday, January 26, 2020
Code-based Plagiarism Detection Techniques
Biraj Upadhyaya and Dr. Samarjeet Borah

Abstract- The copying of programming assignments by students, especially at the undergraduate and postgraduate levels, is a common practice. Efficient mechanisms for detecting plagiarised code are therefore needed. Text-based plagiarism detection techniques do not work well with source code. In this paper we analyse a code-based plagiarism detection technique that is employed by various plagiarism detection tools such as JPlag, MOSS and CodeMatch.

Introduction

The word plagiarism is derived from the Latin word plagiarius, meaning a kidnapper or abductor. In academia or industry, plagiarism refers to the act of copying material without acknowledging the original source[1]. Plagiarism is considered an ethical offence which may incur serious disciplinary action, such as a sharp reduction in marks or, in severe cases, expulsion from the university. Student plagiarism primarily falls into two categories: text-based plagiarism and code-based plagiarism. Instances of text-based plagiarism include word-for-word copying, paraphrasing, plagiarism of secondary sources, plagiarism of ideas, and blunt plagiarism or authorship plagiarism. Plagiarism is considered code-based when a student copies or modifies a program that is to be submitted for a programming assignment. Code-based plagiarism includes verbatim copying, changing comments, changing white space and formatting, renaming identifiers, reordering code blocks, changing the order of operators and operands in expressions, changing data types, adding redundant statements or variables, and replacing control structures with equivalent structures[2].

Background

Text-based plagiarism detection techniques do not work well with coded input such as a program.
Experiments have suggested that text-based systems ignore coding syntax, an indispensable part of any programming construct, which is a serious drawback. To overcome this problem, code-based plagiarism detection techniques were developed. They can be classified into two categories: attribute-oriented plagiarism detection and structure-oriented plagiarism detection.

Attribute-oriented plagiarism detection systems measure properties of assignment submissions[3]. The following attributes are considered:
- Number of unique operators
- Number of unique operands
- Total number of occurrences of operators
- Total number of occurrences of operands
Based on these attributes, the degree of similarity of two programs can be estimated.

Structure-oriented plagiarism detection systems deliberately ignore easily modifiable programming elements such as comments, additional white space and variable names. This makes them less susceptible to the addition of redundant information than attribute-oriented systems. A student who knows that this kind of system is deployed at his or her institution would rather complete the assignment independently than take on a tedious and time-consuming modification task.

Scalable Plagiarism Detection

Steven Burrows, in his paper Efficient and Effective Plagiarism Detection for Large Code Repositories[3], provided an algorithm for code-based plagiarism detection. The algorithm comprises the following steps:

Tokenization

Let us consider a simple C program:

    #include <stdio.h>

    int main( )
    {
        int var;
        /* the loop bound was lost from the source text; 5 is assumed here */
        for (var=0; var<5; var++)
        {
            printf("%d\n", var);
        }
        return 0;
    }

Figure 1.0

Table 1.0: Token list for the program in Figure 1.0. Here ALPHANAME refers to any function name, variable name or variable value. STRING refers to character(s) enclosed in double quotes.
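The tokenization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single-character codes below are hypothetical and are not the codes of Table 1.0, the lexer is a simple regular expression rather than a real C front end, and the comment stripping does not guard against comment markers inside string literals. What it shows is the structural idea described above: comments, white space and concrete names disappear, so two programs that differ only in those respects map to the same token stream.

```python
import re

# Hypothetical token codes for this sketch only (NOT the codes of Table 1.0).
KEYWORDS = {"int": "A", "for": "B", "return": "C"}
SYMBOLS = {"(": "D", ")": "E", "{": "F", "}": "G", ";": "H",
           "=": "I", "<": "J", "++": "K", ",": "L"}
ALPHANAME = "N"  # any function name, variable name or variable value
STRING = "S"     # character(s) enclosed in double quotes

COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.S)
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|\w+|\+\+|[^\w\s]')


def tokenize(source: str) -> str:
    """Map C-like source text to a compact token stream."""
    source = COMMENT_RE.sub(" ", source)      # comments carry no structure
    stream = []
    for lexeme in TOKEN_RE.findall(source):   # white space is skipped by the regex
        if lexeme.startswith('"'):
            stream.append(STRING)
        elif lexeme in KEYWORDS:
            stream.append(KEYWORDS[lexeme])
        elif lexeme in SYMBOLS:
            stream.append(SYMBOLS[lexeme])
        elif re.fullmatch(r"\w+", lexeme):
            stream.append(ALPHANAME)          # names and literal values collapse
        # any other symbol is ignored in this sketch
    return "".join(stream)


original = 'int main() { int var; for (var = 0; var < 3; var++) { printf("%d\\n", var); } return 0; }'
disguised = '''int main() {
    int i;   /* renamed variable, extra comment */
    for (i = 0; i < 3; i++) { printf("%d\\n", i); }
    return 0;
}'''
same_stream = tokenize(original) == tokenize(disguised)
```

Renaming `var` to `i`, reformatting and adding a comment leave the stream unchanged, which is why such edits do not help a student disguise a copied program.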
The corresponding token stream for the program in Figure 1.0 is

SNABjSNRANKNNJNNDDBjNA5ENBlgNl

This token stream is now converted to an N-gram representation. In our case the value of N is chosen as 4. The corresponding 4-grams of the above token stream are shown below:

SNAB NABj ABjS BjSN jSNR SNRA NRAN RANK ANKN NKNN KNNJ NNJN NJNN JNND NNDD NDDB DDBj DBjN BjNA jNA5 NA5E A5EN 5ENB ENBl NBlg BlgN lgNl

These 4-grams are generated using the sliding window technique, which moves a “window” of size N across the token stream from left to right. N-grams are an appropriate basis for structural plagiarism detection because any change to the source code affects only a few neighbouring N-grams. A modified version of a program will still share a large percentage of unchanged N-grams with the original, so plagiarism remains easy to detect.

Index Construction

The second step is to create an inverted index of these N-grams. An inverted index consists of a lexicon and an inverted list. It is shown below:

Table 2.0: Inverted Index

Referring to the entry for mango in the above inverted index, we can conclude that mango occurs in three documents in the collection: once in document no. 31, three times in document no. 33 and twice in document no. 15. Similarly, we can represent the 4-gram representation of Figure 1.0 with the help of an inverted index. The inverted index for five of the 4-grams is shown below in Table 3.0.

Table 3.0: Inverted Index

Querying

The next step is to query the index. Each query is the N-gram representation of a program. A token stream of t tokens yields (t - n + 1) N-grams, where n is the length of the N-gram. Each query returns the ten indexed programs most similar to the query program, ordered from most similar to least similar.
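The sliding window, index construction and querying steps can be sketched together in Python. This is a minimal sketch under stated assumptions, not Burrows's implementation: the function names are hypothetical, the file names are made up for illustration, and the raw score used here (the number of N-gram occurrences a query shares with an indexed program) is a stand-in, since the ranking function is not given in this text. Scores are also reported relative to the top match, which is set to 100%.

```python
from collections import Counter, defaultdict


def ngrams(stream: str, n: int = 4) -> list[str]:
    """Slide a window of size n across the token stream, left to right."""
    return [stream[i:i + n] for i in range(len(stream) - n + 1)]


def build_index(programs: dict[str, str], n: int = 4) -> dict[str, dict[str, int]]:
    """Inverted index: each N-gram in the lexicon maps to {program id: count}."""
    index: dict[str, dict[str, int]] = defaultdict(dict)
    for prog_id, stream in programs.items():
        for gram, count in Counter(ngrams(stream, n)).items():
            index[gram][prog_id] = count
    return index


def query(index, stream: str, n: int = 4, top: int = 10):
    """Return up to `top` (program id, raw score, relative score %) tuples.

    Raw score (an assumption of this sketch): shared N-gram occurrences.
    """
    raw: Counter = Counter()
    for gram, q_count in Counter(ngrams(stream, n)).items():
        for prog_id, d_count in index.get(gram, {}).items():
            raw[prog_id] += min(q_count, d_count)
    ranked = raw.most_common(top)
    if not ranked:
        return []
    best = ranked[0][1]
    return [(prog_id, score, 100.0 * score / best) for prog_id, score in ranked]


# The token stream from the text yields t - n + 1 = 30 - 4 + 1 = 27 4-grams.
stream = "SNABjSNRANKNNJNNDDBjNA5ENBlgNl"
grams = ngrams(stream)

# Hypothetical collection: the program itself, a copy with one token changed,
# and an unrelated stream.
programs = {
    "a.c": stream,
    "b.c": stream[:14] + "Z" + stream[15:],
    "c.c": "XXXXXXXX",
}
results = query(build_index(programs), stream)
```

Changing a single token in `b.c` disturbs only the four 4-grams that overlap it, so `b.c` still ranks directly behind the exact match, while the unrelated `c.c` shares no 4-grams and is not returned at all.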
If the query program is one of the indexed programs, we would expect it to produce the highest score. We assign a similarity score of 100% to the exact or top match[3]. All other programs are given a similarity score relative to the top score. Table 4.0 presents the top ten results from Burrows' experiment, in which the N-gram representation of one program file (0020.c) was compared against an index of 296 programs. In this example, the file scored against itself generates the highest relative score of 100.00%. This score is ignored as a result, but it is used to generate a relative similarity score for all other results. We can also see that the program 0103.c is very similar to program 0020.c, with a score of 93.34%.

Rank | Query File | Index File | Raw Score | Similarity Score

Table 4.0: Results of the program 0020.c compared to an index of 296 programs.

Comparison of various Plagiarism Detection Tools

4.1 JPlag

The salient features of this tool are presented below:
- JPlag was developed in 1996 by Guido Malpohl.
- It currently supports C, C++, C#, Java, Scheme and natural language text.
- It is a free plagiarism detection tool.
- It is used to detect software plagiarism among multiple sets of source code files.
- JPlag uses the Greedy String Tiling algorithm, which produces matches ranked by average and maximum similarity.
- It can be used to compare programs that vary widely in size, which is probably the result of inserting dead code into a program to disguise its origin.
- The results are displayed as a set of HTML pages, including a histogram that presents the statistics for the analysed files.

CodeMatch

The salient features of this tool are presented below:
- It was developed in 2003 by Bob Zeidman and is licensed by SAFE Corporation.
- It is available as a standalone application.
- It supports 26 different programming languages including C, C++, C#, Delphi, Flash ActionScript, Java, JavaScript and SQL.
- It has a free version which allows only one trial comparison, in which the total size of all files being examined does not exceed 1 megabyte.
- It is mostly used as forensic software in copyright infringement cases.
- It determines the most highly correlated files placed in multiple directories and subdirectories by comparing their source code.
- Four types of matching algorithms are used: Statement Matching, Comment Matching, Instruction Sequence Matching and Identifier Matching.
- The results come in the form of a basic HTML report that lists the most highly correlated pairs of files.

MOSS

The salient features of this plagiarism detection tool are as follows:
- MOSS stands for Measure of Software Similarity.
- It was developed by Alex Aiken in 1994.
- It is provided as a free Internet service hosted by Stanford University, and it can be used only if the user creates an account.
- It can analyse source code written in 26 programming languages including C, C++, Java, C#, Python, Pascal, Visual Basic and Perl.
- Files are submitted through the command line and the processing is performed on the Internet server.
- The current form of the program is available only for UNIX platforms.
- MOSS uses the Winnowing algorithm, which is based on code-sequence matching, and it analyses the syntax or structure of the observed files.
- MOSS maintains a database that stores an internal representation of programs and then looks for similarities between them.

Comparative Analysis Table

Conclusion

In this paper we examined a structure-oriented, code-based plagiarism detection technique known as Scalable Plagiarism Detection. Its steps of tokenization, index construction and querying were studied. We also reviewed the salient features of several code-based plagiarism detection tools: JPlag, CodeMatch and MOSS.
References

[1] Gerry McAllister, Karen Fraser, Anne Morris, Stephen Hagen, Hazel White, http://www.ics.heacademy.ac.uk/resources/assessment/plagiarism/
[2] Georgina Cosma, “An Approach to Source-Code Plagiarism Detection and Investigation Using Latent Semantic Analysis”, University of Warwick, Department of Computer Science, July 2008
[3] Steven Burrows, “Efficient and Effective Plagiarism Detection for Large Code Repositories”, School of Computer Science and Information Technology, Melbourne, Australia, October 2004
[4] Vedran Juric, Tereza Juric and Marija Tkalec, “Performance Evaluation of Plagiarism Detection Method Based on the Intermediate Language”, University of Zagreb
Saturday, January 18, 2020
Analyse the dramatic Essay
Analyse the dramatic importance of the end of act one of ‘A View from the Bridge’

Arthur Miller, the playwright of ‘A View from the Bridge’, uses a range of techniques, such as stage directions and language, to bring out the importance of key moments in the play. At the end of act 1, Miller creates the impression that Alfieri is weak when he says “I was so powerless”, indicating that even a lawyer, who should be confident about what to do, was at a loss. There is also suspense about what will happen next, as Alfieri visits an old lady to ask about the fate of Eddie Carbone. Alfieri's last statement after his discussion with the lady is “And so I waited here”, which gives the audience the sense that Alfieri himself fears a disaster is coming, and so we are curious and anxious to find out what it is.

The next part of the extract opens with the characters looking like one big, happy family, with Catherine boasting as usual about what Rodolfo has done: “They went to Africa once. On a fishing boat. (Eddie glances at her.) It's true, Eddie.” Eddie's glance suggests that he does not really want to know what they did. As the family talk about the two ‘submarines’, Rodolfo does not want to contribute to the conversation, so he sits near Catherine while she is “reading a magazine”. On stage, while they are talking about fishing boats, Eddie concentrates especially on what Marco says and replies to him very quickly: “Marco: sardines. Eddie: sure. (laughing) how are you gonna catch sardines on a hook?” Seeing that the two men are having an argument, Beatrice steps in and tries to change the subject.

The mood and atmosphere are very cheerful as Catherine goes on about Rodolfo's adventures. Eddie then jokes that they “paint oranges to make them look like oranges”. Marco reacts to Eddie's joke as if he were telling the truth.
Rodolfo tries to help his brother by changing the subject (“lemons are green”), but this creates a conflict between Rodolfo and Eddie, who reveals his ignorance: “for Christ sake”
Friday, January 10, 2020
Genetic Engineering and the Law Essay
To understand the ethical implications of genetic engineering, we must first understand what genetic engineering is. Genes are units that code for specific characteristics, such as hair and eye colour, which we inherit from our parents. It is the chromosomes in the cell nuclei that enable your body to inherit features or, more specifically, it is the DNA making up the chromosomes that forms a unique genetic code for every human being (apart from identical twins). It is estimated that the human body contains around 50,000 to 100,000 different genes, some of which have been linked to certain diseases. Scientists claim to have identified 4,000 conditions that are linked to just one fault or defect in a person's genetic makeup, which is where genetic engineering comes in. At present a project is taking place to identify the function of every gene in the human body. ‘The Human Genome Project’ aims to uncover the causes of many diseases and find cures for them. One such cure is genetic engineering. Genetic engineering, as a cure for disease, is the removal of a defective gene sequence and the remodelling of it. But this isn't the only definition given for genetic engineering. Compassion in World Farming (CIWF) describes it as ‘the taking of genes from one species of plant or animal and inserting them into a completely different species’. It is obvious, therefore, that genetic engineering is used for different things in different situations. In this essay I will look at some of the varying uses genetic engineering has in today's world and the ethical implications of such uses.

Genetic Engineering and the Law

At present human cloning is illegal in the UK, although there are many countries where such a law does not exist.
And although, technically, it may be possible to clone humans in the way animals have been, the Act of Parliament strictly forbids ‘ever doing with human eggs what we have done with sheep eggs’ (Dr Ron James, Head of PPL Therapeutics). Nor are scientists allowed to mass-produce human eggs for in-vitro fertilisation, something that many scientists have been pushing for for years. Genetically modified crops are also strictly controlled by the law. Such UK laws include the Genetically Modified Organisms (Contained Use) Regulations 1992 and the Genetically Modified Organisms (Deliberate Release) Regulations 1992. These laws are in addition to the standard Food Safety Act, which specifies that food ‘must be fit for consumption’. Several government bodies have been set up to assess and regulate GM foods, including ACNFP, COT, FAC and, the most important, the Department of the Environment (DoE). The DoE requires that anyone proposing a release must apply to it for consent first. It is then advised by the Advisory Committee on Release to the Environment on the granting of consents. At a European level, the Regulation on Novel Foods and Food Ingredients was introduced in May 1997 and covers the labelling of foods ‘no longer equivalent’ to their conventional counterparts. But despite the introduction of laws, many people are still unhappy and are pushing for further action. For example, the CIWF believe GM meat should be clearly labelled, although they also say it should not be sold in the first place. They see the genetic engineering of farm animals for food as cruel and unnecessary. But the question remains: are they right? Few people know the implications of genetic engineering and what it really involves, and many are ignorant of what to expect from GM.

Genetic Engineering and Animals/Humans

Everyone knows the story of the first cloned animal.
The Finn Dorset sheep, known as Dolly, was the first new-born mammal to be cloned from adult cells and is a miracle for scientists the world over. She has opened many new windows of opportunity for scientists, who hope soon to be able to clone humans using the same technology. The possibilities really are endless. A single cell from an elite racehorse could be used to create hundreds of identical copies, each with the same elite genetic makeup. However pleasing this heady new discovery is, there is a widespread argument over whether or not cloning is right. Is it simply a wonderful new way to develop a generation of disease-free animals and humans, or is it tampering with nature and playing God? Many people see it as the answer to all problems, arguing that screening can reveal vital information about a person's life span and future health. Genetic engineering could, in theory, identify genetic defects early on, giving time to replace the faulty gene and cure the sufferer. Predicting disease is a major use for genetic engineering and one that could change the way we live forever. At present scientists are working on a genetic test known as the GeneChip. They claim that in a few years doctors will be able to take a simple mouth swab and, using the GeneChip, look through your DNA for disease prospects. Although they have come under fire from their critics, geneticists argue that everyone is entitled to know what their future holds for them health-wise. Indeed, they say the information can be vital for planning out the rest of your life if, for example, you are a woman with a likelihood of developing breast cancer. Pre-natal diagnosis is another option that could soon be open to the public. Parents could be made aware of any flaws there may be in their child's DNA and could decide whether or not to carry on with the pregnancy. Genetic engineering could also be used to grow substances like human insulin and growth hormone on a huge scale.
Currently scientists are looking at introducing blood-clotting genes for haemophiliacs and purifying milk from GM sheep for the treatment of cystic fibrosis. They are also studying presently incurable diseases in the hope that they might be able to introduce a cure using genetic engineering. There are also high hopes for animals in genetic engineering. Transgenic animals (those that have been given a gene from another animal) have many uses. They can produce more meat and milk, feeding the starving, and they can grow faster, with the possibility of less fatty meat. They can be bred to resist disease, but also to develop disease so that they might be tested on for further research. A biotechnology firm in Cambridge is working on a transgenic pig that could be bred to grow desperately needed organs for transplant into human beings. The technique can also be used to ‘knock out genes’, deleting proteins so that it might prevent BSE in cows. But it isn't all good news for genetic engineering; in fact, there is a lengthy and strong argument as to why it is dangerous to take it to these levels. Many have disagreed with the predicting of disease, saying that many people may not be able to cope with the knowledge that they may contract a terminal disease, and that it could ruin lives. There has also been widespread outcry over the Association of Insurance Brokers' announcement that it will not offer life insurance over £100,100 to anyone who has taken a genetic test that predicted fatal disease, and since 1995 there has been pressure from MPs to develop a code of practice concerning genetic screening. There are also fears of employers discriminating against potential employees who have the potential for life-threatening illness in later life.
Although scientists hope genetic engineering will provide many choices for parents, the BMA has voiced its concerns that the industry will lead to ‘selective breeding’, or the choice to abort a baby because of undesirable characteristics such as physical traits. The BMA has also said people have been misled about the power to screen for later abnormalities. It says: ‘The number of abnormalities which can be detected in this way is limited and few of the tests are conclusive’. The problem many people have with genetic engineering is the risk of error that is involved. Screening is complex and it is difficult to be precise every time. Faulty diagnosis could put an end to job prospects or insurance benefits, not to mention the psychological problems arising from finding out you have the potential to contract a fatal disease.
Thursday, January 2, 2020
Marx's Theory of Political Economy
Although the term was originally coined by Bernard London (1932) in the context of the Great Depression, to this date there is no generally accepted definition of planned obsolescence. According to a common definition by Tim Cooper, planned obsolescence is “the outcome of a deliberate decision by suppliers that a product should no longer be functional or desirable after a predetermined period” (Cooper, 2010, p. 4). Another frequently cited definition was formulated by industrial designer Brooks Stevens: “Instilling in the buyer the desire to own something a little newer, a little better, a little sooner than is necessary” (cit. in Adamson, 2003, p. 4). Whereas the former definition emphasises the planning involved in designing products, the latter foregrounds the manipulation of consumer desires and implicitly argues that these desires have become detached from actual human needs. What is frequently overlooked, however, is the inherent critique of capitalism in the narrative of planned obsolescence and how strongly it builds upon Marx's theory of political economy. Marx was already well aware of the fact that even perfectly functioning goods can lose their value and become obsolete, a phenomenon he termed “moral depreciation” (1992, p. 264). To Marx, the reason for this lies in the capitalist logic of accumulation, which forces manufacturers to constantly innovate and modernise their means of production. The higher production capacity resulting from this, however, can only be maintained