Journal of Environmental Treatment Techniques  
2020, Volume 8, Issue 1, Pages: 787-793  
J. Environ. Treat. Tech.  
ISSN: 2309-1185  
Journal web link: http://www.jett.dormaj.com  
The Development of a New Data Migration  
Model for NoSQL Databases with Different  
Schemas in Environment Management System  
Lim Fung Ji¹*, Nurulhuda Firdaus Mohd Azmi²
¹ Department of Computer Science and Embedded Systems, Faculty of Computing and Information Technology, Tunku Abdul Rahman University College
² Advanced Informatics Department, Razak Faculty of Technology and Informatics, University Technology Malaysia
Received: 12/06/2019  
Accepted: 22/01/2020  
Published: 20/02/2020  
Abstract  
Data migration transfers data from one database to another. The motivations for data migration include, for example,
transferring data from a legacy database to a modern one and keeping data up-to-date and consistent in a distributed system.
Compared to data migration between traditional databases, data migration between heterogeneous NoSQL databases is more
challenging due to characteristics of NoSQL databases such as flexible schemas, different supported features, and different
storage paradigms. These differences may cause data quality problems after data migration, especially for environment management
systems, where data are required to predict or to convey accurate information. Therefore, the migration of data between
heterogeneous NoSQL databases requires not only overcoming the differences between these databases, but also ensuring the quality of
the migrated data. In this paper, we propose a data migration hub, a model that uses a record-to-record migration style to transfer
data between different NoSQL database schemas. The proposed hub is applicable to environment management systems, with
data validation and fault tolerance in the migration process. As confirmed by the pilot study, our method is able to migrate the full set of
fields to the destination database in MongoDB.
Keywords: Data migration; NoSQL database; Heterogeneous schema; Document-based NoSQL; MongoDB; Environment
management system
1 Introduction
The data migration process transfers data from one database
to another for several reasons, such as transferring data
from a legacy system to a newly developed system or
moving data between distributed data nodes. This study
focuses on data migration between NoSQL databases. In
this migration process, the different storage paradigms of
NoSQL databases need to be taken into consideration in
order to avoid any discrepancy in data quality from the
original data source. The NoSQL database is a type of
database that is becoming increasingly popular. It is applied
in different areas, such as environment management systems
(EMSs) that use computing technology. An environment
management system is a system that assists in the
management of the environmental impacts of an
organisation and enhances the environmental performance
of its services and products [17]. One example of an EMS
application is the management program for water quality in
Bangladesh [18]. Therefore, an EMS should provide
correct, precise, and up-to-date data for effective
management of environmental issues.
2 Data Migration between NoSQL Databases  
2.1 Data migration  
Data migration is defined as “a tool-supported one-time
process that aims at migrating formatted data from a source
structure to a target data structure, where the two structures
differ on a conceptual and/or physical level” [1]. In addition,
data migration means not only transferring data to a
destination database, but also adapting the migrated data to
the data schema, model, and data types of the destination
database [2]. Migration of data involves different types of
data stores, which can be of the same or of different types.
In the case of data migration between NoSQL databases, the
characteristics of these databases must be considered
carefully.
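As an illustration of adapting a record to a destination schema, the following minimal Python sketch renames fields and converts their types. The field names, the `FIELD_MAP` table, and the `adapt_record` helper are hypothetical and are not taken from the paper; they only show the kind of adaptation described above.

```python
from typing import Any, Callable

# Hypothetical mapping: source field name -> (target field name, type converter).
FIELD_MAP: dict[str, tuple[str, Callable[[Any], Any]]] = {
    "station": ("station_id", str),
    "ph": ("ph_level", float),
    "sampled": ("sampled_at", str),
}


def adapt_record(source: dict[str, Any]) -> dict[str, Any]:
    """Rename fields and convert types so the record fits the target schema."""
    target: dict[str, Any] = {}
    for src_name, (dst_name, convert) in FIELD_MAP.items():
        value = source.get(src_name)
        # A field absent from the source is kept as an explicit None in the target.
        target[dst_name] = None if value is None else convert(value)
    return target


record = {"station": 17, "ph": "7.2"}
print(adapt_record(record))
# {'station_id': '17', 'ph_level': 7.2, 'sampled_at': None}
```

In this sketch a source field that is not present is written as `None` rather than dropped, so that the gap stays visible in the migrated record.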
*Corresponding author: Lim Fung Ji, Department of Computer Science and Embedded Systems, Faculty of Computing and
Information Technology, Tunku Abdul Rahman University College. E-mail: limfj@tarc.edu.my
2.2 Characteristics of NoSQL database
NoSQL databases are available in four generic types:
document-based, column-based, key-value, and graph [3].
A NoSQL database has an advantage over a relational
database due to its “flexi-schema”. The “flexi-schema”
behaviour allows records with different structures to be
stored within the same table [4]. For example, in a
document-based NoSQL database such as MongoDB,
documents (records) within the same collection (table) are
allowed to contain different numbers of fields. The “shared
nothing architecture” of NoSQL uses local storage pools,
which allow faster data access as data nodes are added.
NoSQL databases are also highly elastic: data are replicated
to newly added data nodes [5]. With eventual consistency,
data can be read from replicas on other machines if one
machine is down [4].
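The “flexi-schema” behaviour can be illustrated with a small Python sketch that models a collection as a list of dictionaries. The documents, field names, and helper functions below are hypothetical examples, not data from the paper, and no database driver is used.

```python
# Two hypothetical documents that could be stored in the same collection.
collection = [
    {"_id": 1, "name": "Station A", "ph": 7.1},
    {"_id": 2, "name": "Station B", "ph": 6.8, "turbidity": 3.4, "notes": "after rain"},
]


def field_union(docs):
    """Return the set of all field names used anywhere in the collection."""
    fields = set()
    for doc in docs:
        fields.update(doc)
    return fields


def missing_fields(docs):
    """For each document id, list the fields that other documents have but it lacks."""
    all_fields = field_union(docs)
    return {doc["_id"]: sorted(all_fields - set(doc)) for doc in docs}


print(missing_fields(collection))
# {1: ['notes', 'turbidity'], 2: []}
```

A check of this kind shows which fields would be empty if every document were forced into one fixed structure during migration.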
2.3 Challenges of data migration in NoSQL databases
Three challenges commonly arise in data migration [6]:
(a) interruption of business operation; (b) loss of data and
degradation of data consistency; and (c) effort and cost
required for data migration. In NoSQL databases, the data
migration process faces further challenges related to the
common challenges stated above, which are caused by the
characteristics of NoSQL databases.

(a) Heterogeneous storage paradigm: Each type of
NoSQL database implements a different way to store data.
Different storage paradigms have specific rules and formats
for storing data. Table 1 summarizes the storage
paradigms of NoSQL databases. The table indicates that
reformatting or restructuring of data is required in order to
map the data to the storage structure of the target database.
Therefore, data qualities such as completeness,
consistency, and correctness are a concern when data are
being restructured or reformatted. The degradation of data
quality leads to higher costs for recovery and for data
quality enhancement.

Table 1: NoSQL database storage paradigms.

| NoSQL database | Storage paradigm |
|---|---|
| Document-based | Allows key-value pairs to be embedded in a document; allows search based on both key and value [7] |
| Columnar database | Stores data in a distributed, multi-dimensional map; has mixed row/column storage [8] |
| Key-value database | Stores data in byte arrays; accesses data through a key-value hash table (each key points to a specific datum) [8] |
| Graph database | Stores data in nodes; connects data by edges (an edge represents the relationship between nodes); uses pointers to point to other nodes [7] |

(b) Flexibility in schema structure: The “flexi-schema” of
NoSQL allows more flexibility in storing data [4]. For
example, in MongoDB, documents in the same collection
may have different numbers of fields. Figures 1 and 2 [9]
show examples of different schemas in documents.

Figure 1: MongoDB document [9]

(c) Supporting features: Even within the same type
of NoSQL database, other features such as query language
support and CAP properties differ [11].

The challenges mentioned above stem from the
features of NoSQL databases. From the perspective of
data quality, data migration also faces challenges in
maintaining the quality of the migrated data. Table 2
summarizes the challenges of data migration from the
perspective of data quality.

Table 2: Data quality challenges of data migration [10].

| Challenges | Details |
|---|---|
| Missing Data | The old and new databases may have different fields. Some fields in the new database may not exist in the old one. The NULL value is used to represent the non-existence of data, which is critical for migration of data between different storage formats. |
| Data Accuracy | When data are migrated manually, especially when keyed in by humans, the accuracy of the data is not guaranteed. |
| Legacy System | Some data may be lost as a result of system upgrades. |
| Data Element | Existing data may use the same word for different definitions. Therefore, the values of the transferred data need further clarification. This problem affects data consistency. |

From the challenges discussed in this section, it is
clear that data migration must not only ensure that the data
can be migrated, but also address the quality of the migrated data.
For NoSQL databases, these challenges have significant
effects on data quality. For example,
the flexibility of schema in NoSQL may cause missing data
in some fields of the target database.

3 Related Work
The authors in [2] attempted to overcome the challenge
of different data formats between NoSQL databases. To this
end, they proposed an approach for migrating data between
different types of NoSQL databases by converting the
existing data into an intermediate format, which is later
converted again to the format required by the destination database. The
approach migrated data between column-based NoSQL