International Journals
Permanent URI for this collection: https://dspace.psgrkcw.com/handle/123456789/178
55 results
Search Results
Item PIONEERING METHODS FOR ENHANCING PPI AND PHENOTYPE NETWORKS FOR CANDIDATE DISEASE PRIORITIZATION (International Journal of Engineering and Advanced Technology - Blue Eyes Intelligence Engineering & Sciences Publication, 2019-12) J, Maria Shyla; M, Renuka Devi
Protein-Protein Interactions (PPIs) are high-specificity physical contacts between two or more protein molecules. PPI networks are modeled as graphs in which nodes denote proteins and edges denote interactions between proteins. The PPI network plays an important role in identifying promising disease gene candidates, but it usually contains false interactions. Many techniques have been proposed to reconstruct the PPI network so as to remove false interactions and improve the ranking of candidate disease genes. Random Walk with Restart on Diffusion Profile (RWRDP) and Random Walk on a Reliable Heterogeneous Network (RWRHN) are two of them. In these methods, gene topological similarity was incorporated into the original PPI network to reconstruct a new PPI network. A phenotype network was constructed by calculating the similarity between gene phenotypes. The reconstructed network and the phenotype network were combined to rank candidate disease genes. However, the PPI reconstruction depended entirely on the quality of the protein interaction data. To enhance the reconstruction of the PPI network, a Piecewise Linear Regression (PLR) based protein sequence similarity measure and a Bat Algorithm based gene expression similarity were proposed with RHN. In this paper, an additional measure called the Interaction Level Subcellular Localization Score (ILSLS) is proposed to further reduce false interactions in the reconstruction of the PPI network. ILSLS is the combination of the Normalized Subcellular Localization score (NSL) and the Protein Multiple Location Prediction score (PMLP). The proposed work is named Random Walker on Optimized Trustworthy Heterogeneous Subcellular Localization aware Network (RW-OTHSN).
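The random-walk ranking that these methods build on can be illustrated with a minimal random walk with restart on a toy weighted graph. This is only a sketch under stated assumptions, not the authors' RW-OTHSN: the gene names, edge weights, and restart probability of 0.3 are hypothetical, and the graph is a single undirected network rather than the heterogeneous PPI and phenotype network described in the abstract.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that restarts at the seeds.

    adj   : dict mapping node -> dict of neighbor -> edge weight (symmetric)
    seeds : set of nodes known to be associated with the disease (assumed input)
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    # Total edge weight per node, used to normalize transition probabilities.
    strength = {n: sum(adj[n].values()) for n in nodes}

    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if strength[u] == 0:
                # Isolated node: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / strength[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain g1 - g2 - g3 - g4 with unit weights.
toy_ppi = {
    "g1": {"g2": 1.0},
    "g2": {"g1": 1.0, "g3": 1.0},
    "g3": {"g2": 1.0, "g4": 1.0},
    "g4": {"g3": 1.0},
}
scores = random_walk_with_restart(toy_ppi, seeds={"g1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Candidates are ranked by their steady-state probability, so genes closer to the seed set in the network receive higher scores.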
To enhance the ranking of RW-OTHSN, phenotype structure is considered while constructing the phenotype network used to rank the candidate disease genes. The phenotype structure is characterized using the h*-sequence model, which identifies highly discriminative signatures with only a small number of genes. This proposed work is named Random Walker on Optimized Trustworthy Heterogeneous Subcellular Localization and Phenotype Structure aware Network (RWOTHSPN). The efficiency of the proposed methods is evaluated on a PPI network database in terms of average degree and relative frequency for PPI reconstruction, and number of successful predictions, precision and recall for candidate disease gene ranking.
Item PRIORITIZATION OF CANDIDATE GENE ASSOCIATED WITH DISEASES IMPROVED BY RANDOM WALKER ON OPTIMIZED TRUSTWORTHY HETEROGENEOUS NETWORK (Jour of Adv Research in Dynamical & Control Systems, 2019-04) J, Maria Shyla; M, Renuka Devi
Candidate genes associated with diseases can be ranked by reconstructing the PPI network. In current biomedical research, the prioritization of candidate genes is the most essential issue. A reliable heterogeneous network was used for candidate gene prioritization for diseases. This network was constructed by fusing the PPI network reconstructed by topological similarity, the relationships between diseases and the genes of proteins, and the phenotype similarity network. Then, the candidate genes were prioritized by Random Walker on the Reliable Heterogeneous Network (RWRHN), a random walk-based algorithm. PPI network reconstruction by protein characteristics further improved the prediction accuracy of disease genes. In this paper, the prioritization of candidate genes for diseases is further improved by the proposed Random Walker on Optimized Trustworthy Heterogeneous Network (RW-OTHN), which additionally considers protein sequence similarity and gene expression profile similarity while reconstructing the PPI network.
The protein sequence similarity is calculated by a piecewise linear regression model. The gene expression profile similarity is calculated by applying subspace clustering to high-dimensional gene expression profile data. The subspace clustering is performed by a multi-objective Bat algorithm and K-means clustering. The prioritization of candidate genes is improved by considering protein sequence similarity and gene expression profile similarity in PPI network reconstruction.
Item ANALYSIS OF VARIOUS DATA MINING TECHNIQUES TO PREDICT DIABETES MELLITUS (Research India Publications, 2016-01) J, Maria Shyla; M, Renuka Devi
Data mining approaches help to diagnose patients’ diseases. Diabetes Mellitus is a chronic disease that affects various organs of the human body. Early prediction can save lives and help keep the disease under control. This paper explores the early prediction of diabetes using various data mining techniques. The dataset comprises 768 instances from the PIMA Indian Dataset and is used to determine the accuracy of the data mining techniques in prediction. The analysis shows that the Modified J48 Classifier provides higher accuracy than the other techniques.
Item ASSOCIATION RULE MINING FOR CLIQUE PERCOLATION ON COMMUNITY DETECTION (SERSC, 2020-01) Sathiyakumari K; Vijaya M.S
Detecting communities that link similar nodes is a challenging subject in the study of social network data. It has been extensively studied in the social networking community in terms of the underlying graph structure as well as communication among nodes, with the aim of improving the quality of the discovered communities. A new approach to community detection is proposed based on frequent patterns and the actions of users on networks. This research work uses association rule mining to discover communities of similar users based on their interests and activities.
The Clique Percolation technique, initially proposed for deriving communities in directed networks, is extended by using the discovered patterns to seek network components, i.e., internally tightly linked groups of nodes in directed networks, discovering overlapping communities efficiently. Community measures such as the size of the community, length of the community and modularity of the community are used for testing the validity of communities. The proposed community detection approach is tested on sample Twitter data of sports person networks, with F-measure and precision showing that the proposed method leads to improved community detection quality.
Item OVERLAPPING COMMUNITY STRUCTURE DETECTION USING TWITTER DATA (International Journal on Emerging Technologies, 2019-12) Sathiyakumari K; Vijaya M S
Overlapping community detection is progressively becoming a significant issue in social network analysis (SNA). Faced with massive amounts of information while simultaneously restricted by hardware specifications and computation time limits, it is difficult for clustering analysis to reflect the latest developments or changes in complex networks. Techniques for finding community clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering results tend to be unnatural and deviate from the intrinsic grouping nature of community groups. In this method, highly cohesive maximal community cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. These approaches can be considered a generalization of edge percolation with great potential as a community finding method in real-world graphs. The main objective of this work is to find overlapping communities based on the Clique Percolation method.
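As a minimal sketch of the clique percolation idea referred to here: enumerate all k-cliques, treat two k-cliques as adjacent when they share k-1 nodes, and report the union of each chain of adjacent cliques as one community. The sketch assumes an undirected graph and brute-force clique enumeration, which is only practical for small examples; the function names and the toy edge list are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map (node -> set of neighbors)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_cliques(adj, k):
    """Enumerate all k-node subsets in which every pair is connected."""
    return [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(c, 2))
    ]


def clique_percolation(adj, k=3):
    """Merge k-cliques that share k-1 nodes; each merged group is a community."""
    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Two triangles that share only node "c": they overlap but do not merge for k=3.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"), ("d", "e"), ("c", "e")]
communities = clique_percolation(build_adj(toy_edges), k=3)
print(communities)
```

Node "c" belongs to both resulting communities, which is the kind of overlap that partition-based clustering cannot express.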
Variants of the clique percolation method, such as the Optimized Clique Percolation Method (OCPM) and the Parallel Clique Percolation Method (PCPM), have also been implemented. This research work focuses on the Clique Percolation algorithm for deriving communities from a sports person’s network. Three algorithms have been applied for finding overlapping communities in the sports person network, of which the CPM algorithm discovered more communities than OCPM and PCPM. The CPM overlapping algorithm discovered 198 communities in the network. The OCPM algorithm found 180 communities of different sizes. The PCPM algorithm discovered 170 communities with different numbers of nodes in the graph. Community measures such as size of the community, length of the community and modularity of the community are used for evaluating the communities. The proposed parallel method found a large number of communities and modularity score with less computational time. Finally, the parallel method shows the best performance in detecting overlapping communities from the sports person network.
Item APPROACHES FOR FINDING COHESIVE SUBGROUPS IN SOCIAL NETWORKS USING MAXIMAL K‐PLEX (Academic publisher, 2018-07) Sathiyakumari K; Vijaya M S
A k-plex is a clique relaxation introduced in social network analysis to model cohesive social subgroups that allow for a limited number of nonadjacent vertices inside the cohesive subgroup. Numerous algorithms and heuristic procedures for finding a maximum-size k-plex in a graph have been developed recently for this NP-hard problem. This work introduces and studies the maximum k-plex problem, which arises in social network analysis and graph-based data mining. The maximum clique problem provides a classic framework for detecting cohesive subgraphs. The clique model is one of the most important techniques for cohesive subgraph detection; however, its applications are rather limited because of the restrictive conditions of the model.
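As a rough illustration of the k-plex relaxation this abstract describes, the check below tests whether a vertex subset is a k-plex. It assumes the common convention that every member must be adjacent to at least |S| - k other members of the subset S, so that a 1-plex is an ordinary clique; the helper names and the toy graph are hypothetical, and this is only a membership test, not the maximal k-plex search studied in the paper.

```python
def build_adj(edges):
    """Build an undirected adjacency map (node -> set of neighbors)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_k_plex(adj, subset, k):
    """True if each vertex in `subset` has at least |subset| - k neighbors inside it."""
    members = set(subset)
    required = len(members) - k
    return all(len(adj.get(v, set()) & members) >= required for v in members)


# A 4-cycle a-b-c-d-a: each vertex misses exactly one other member.
cycle = build_adj([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
print(is_k_plex(cycle, {"a", "b", "c", "d"}, k=1))  # clique test
print(is_k_plex(cycle, {"a", "b", "c", "d"}, k=2))  # relaxed test
```

The 4-cycle fails the clique test but passes as a 2-plex, which shows how the relaxation admits cohesive groups that a strict clique model would reject.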
Consequently, many studies resort to the k-plex, a graph in which every vertex is adjacent to all but at most k vertices, which is a relaxation of the clique model. This work proposes to compute maximum k-plexes by exploiting the structural properties of the network. Additionally, it focuses on the maximal k-plex algorithm for deriving subgroups from a sports person’s network and uses subgraph measures such as in-degree k-plex and out-degree k-plex for comparing the sub-communities.
Item PROTEIN SEQUENCE DETECTION USING ENSEMBLE LEARNING (Academic publisher, 2018-05) Sathiyakumari K; Sasikala P
The main objective is to predict the structure of proteins. The key to the wide variety of functions shown by individual proteins lies in the three-dimensional structure adopted by their sequence. To understand protein function at the molecular level, it is important to study the structure adopted by a particular sequence. This is one of the greatest challenges in bioinformatics. There are four types of structure: primary, secondary, tertiary and quaternary. Secondary structure prediction is an important intermediate step in this process because 3D structure can be determined from the local folds that are found in secondary structures by ensemble learning methodology.
Item MAXIMAL K-CORE SUB GRAPH ANALYSIS OF TWITTER DATA NETWORK (Institute of Advanced Scientific Research, 2018-04) Sathiyakumari K; Vijaya M S
The detection of community structure in big complex networks is a promising field of research with many open challenges. Community subgraphs are characterized by dense connections or interactions among their nodes. Community detection and evaluation is a critical task in graph mining. A variety of measures have been proposed to evaluate the quality of such communities.
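The k-core notion named in this item's title can be illustrated with the standard peeling procedure: repeatedly remove every node whose degree falls below k until no such node remains. The sketch below assumes an undirected, unweighted graph, whereas the paper also reports in-degree and out-degree variants; the function names and the toy edge list are hypothetical.

```python
def build_adj(edges):
    """Build an undirected adjacency map (node -> set of neighbors)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core(adj, k):
    """Return the node set of the k-core, found by peeling low-degree nodes."""
    # Work on a copy so the caller's graph is left untouched.
    core = {u: set(vs) for u, vs in adj.items()}
    queue = [u for u, vs in core.items() if len(vs) < k]
    while queue:
        u = queue.pop()
        if u not in core:
            continue  # already removed via an earlier queue entry
        for v in core.pop(u):
            if v in core:
                core[v].discard(u)
                if len(core[v]) < k:
                    queue.append(v)
    return set(core)


# A triangle a-b-c with a tail c-d-e hanging off it.
toy = build_adj([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")])
print(sorted(k_core(toy, 2)))
```

Peeling removes the tail nodes first, leaving only the densely connected triangle as the 2-core.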
This paper evaluates communities based on the k-core concept, as a means of evaluating their collaborative nature, a property not captured by single-node metrics or by established community evaluation metrics. This work focuses on the maximal k-core algorithm for deriving subgroups from a sports person’s network and uses subgraph measures for comparing the subgroups. The subgraph measures used include total degree k-core, in-degree k-core, out-degree k-core, and transitivity. Based on the k-core, which essentially measures the robustness of a community under degeneracy, the approach extends to weighted graphs, devising a novel concept of k-cores on weighted graphs.
Item MAXIMAL CLIQUE AND K-CLIQUE ANALYSIS OF TWITTER DATA NETWORK (Academic publisher, 2018-01) Sathiyakumari K; Vijaya M S
The maximal clique problem (MCP) is to find a subgraph of maximum cardinality. A clique is a subgraph in which all pairs of vertices are mutually adjacent. The detection of communities in social networks is a challenge. An effective way to model communities is through maximal cliques, i.e., maximal subgraphs in which every pair of nodes is connected by an edge. A contemporary method for finding maximal cliques in very large networks is to decompose the network into blocks, after which a distributed computation is carried out. These techniques exhibit a trade-off between performance and completeness: decreasing the size of the blocks improves performance, but some cliques may remain undetected, since high-degree nodes, also referred to as hubs, may not fit with their whole neighborhood into a small block. This paper presents a distributed approach that suitably handles hub nodes and is able to find maximal cliques in large networks, meeting both completeness and efficiency. The approach relies on a two-level decomposition process.
The first level aims at recursively identifying and isolating tractable portions of the network. The second level further decomposes the tractable portions into small blocks. This work focuses on the maximal clique algorithm for deriving subgroups from a sports person’s network and makes use of subgraph measures for evaluating the sub-communities. The subgraph measures used are degree, in-degree, out-degree, closeness, subgraph centrality, eigenvector centrality, and nodal centrality. This research work is able to correctly find all maximal cliques, provided the sparsity of the network is bounded, as is the case for real-world social networks. Experiments confirm the effectiveness, efficiency, and scalability of the solution.
Item PROTEIN SEQUENCE ANALYSIS FOR BREAST CANCER DISEASE (Int. Journal of Engineering Research and Application, 2017-09) Sasikala P; Sathiyakumari K
The foremost challenge facing the molecular biology community today is to make sense of the wealth of data that has been produced through the genome sequencing projects. Cells contain a central core called the nucleus, which is the storehouse of an important molecule known as DNA. DNA is packaged in small units known as chromosomes, which are collectively known as the genome. As computerized applications are used all around the world, vast collections of data are being accessed by people. The significant information hidden in these vast data is attracting researchers from multiple disciplines to develop effective approaches for extracting the hidden knowledge within them. In protein and DNA analysis, sequence mining techniques are used for sequence alignment, sequence searching and sequence classification. Researchers are showing interest in protein sequence analysis, particularly in the field of protein sequence classification. It has the capability to discover the persistent structures that exist in protein sequences.
This work explains various techniques for analyzing protein sequence data and also provides an overview of different protein sequence analysis methods.