Friday, April 3, 2009

Posting 5: Introduction of SQL

SQL

Structured Query Language (SQL) is the standardized computer language used to interface with database, which is used to retrieve and manage data in the relational database systems. IBM first designed and applied the language in its early development of database system. In 1986, SQL was identified as the standard language by ANSI for the relational database management system; In1987, ISO supported and made it become the international standardized language. In fact, although most database systems use SQL, most of them also have their own additional proprietary extensions that are usually only used on their systems. At present, ANSI SQL 89 is adopted by most of database systems.

SQL is an advanced programming language, which allows users to work on high-level data structure regardless of the way and the method of data storage. Its interface can use the same SQL as input and management of data between database systems with different underlying structures and between different databases. All SQL statements can accept record set for input and send record set for output. The output of a SQL statement can become the input of another SQL statement. It shows that SQL statements can be overlapped to use. So SQL possesses great flexibility and utility. In most cases, just a SQL clause instead of a group of clauses of other programming languages can express a complete event.

The SQL language includes three parts:
1. Data Definition Language (DDL)
2. Data Manipulation Language (DML)
3. Data Control Language (DCL)

1. DDL can be used to define new tables and associated elements. Its most basic items are the CREATE, ALTER, RENAME, TRUNCATE and DROP statements:
For example, when we want to create a customer table, we need to use the create statement as follows:

CREATE TABLE Customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)

2. DML can be used to add, update and delete data and control transactions.
For example, we use update statement to modify the sale amount of $400 to $600 occurred in March 29 in Montreal in a Store_Information table as follow:



UPDATE Store_Information
SET Sales = 600
WHERE Store_name = "Montreal" AND Date = "Mar-29-2008"

3. DCL allows the user to control who has access to see or operate data within a database, including two keywords: Grant and Revoke.
GRANT designates several users to work on an operation or a set of operations.
REVOKE eliminates or limits the right of a user to perform an operation or a set of operations
For example, we use the Grant statement to authorize a user to manipulate data in the Store_Information table mentioned above:

GRANT SELECT, UPDATE
ON Store_Information
TO Joe, Jim;

The SQL language can also be broken down into several language elements, including clauses, expressions, predicates, queries, statements.
In order to retrieve information from a database, we use a query that is the most common operation in SQL language, which is separate from Data Manipulation Language.
For example, if we need to pick up the store_name for sales=$400 from Store_Information mentioned above, we use select statement illustrated as follow:

SELECT Store_name
FROM Store_Information
WHERE Sales=$400

Problem of Security

Using SQL also has the safety difficulty. Now a worse attack to SQL is SQL Injection attack, which can add some destructive command to destroy database systems (DROP TABLE, DROP DATABS, OR DELETE * FROM some table). It will lead to service disruption or paralysis of system. Now the best way to defense against SQL Injection is to enforce the Use of Parameterized Statements.

Reference:
http://kb.iu.edu/data/ahux.html
http://en.wikipedia.org/wiki/SQL
http://www.geocities.com/SiliconValley/Vista/2207/sql1.html
http://msdn.microsoft.com/en-us/library/ms161953.aspx

Posting 4: Dicussion of Metadata

MetadataItalic

Metadata has been used in IT for many years. Especially, with the ubiquity of WWW , the increasing need for access to different distributed information and the increased use of various sources all demand common representation of metadata.


Metadata is sticked to data and interpret data. Using the attached metadata, the data in metadata processing systems is deciphered. For example, for WWW and systems engineering, metadata is essential for e-business from advertising through directory information items through oringinal enquiry to contract, purchase, delivery and subsequent guarantee or maintenance.


Role of Metadata


Metadata is data about data. Metadata can describe a data source, a particular collection of data, including a file, a database, a table in a relational database, a class in an object-oriented database, an instance of data like tuple in a relational database table and object instance in a class within an object-oriented database, or data associated with the values of an attribute within a domain like the particular value of an attribute in one instance; metadata can describe data models, processes, including an overall processing system environment, a processing system, a process, a component of a process; It can describe a software, a program, or a specification; It can describe an event system, an individual event, a constraint system and an individual constraint; it can describe people and their roles in an IT system; it can describe an organisation, a department, individuals or individuals in a certain role. Metadata can be divided into five types: descriptive, administrative, structural, technical, and use metadata.

Different type of metadata own different standardisation to aid in data exchange, such as the standard of engineering, healthcare, or libraries. Exchange metadata for internet resources – the Dublin Core - has been proposed and used, and a general metadata model, RDF (Resource Description Framework) has been proposed with the implementation language XML.


Importance of metadata


Due to complexity of society, the increasing number of electronical information sources and the increasing number of users, metadata has increasingly become important. It is essential to be used in effective queries, explaining answers, coodinating between information sources. Metadata becomes the glue that holds together the large amount of information and e-Businesss on the internet. Metadata is essential for WWW to be used effectively, for finding relevant information and for holding together data and information from different sources. Metadata is necessay for simplifying queries so that they select those information that the user wants. Metadata is essential for understanding the structural information, its quality and its relevance. Metadata is essential in interprating answers from some complex IS. The following contents introduce the detailed impotance of metadata.


  1. Metadata can act as an accelerator and improver to searching for resources. For example, through web browser to search and download a certain free software, we can quickly find it by using the relative descriptive metadata.
  2. Metadata can also be used to find files manually. For Example, when searching historical file administered and stored in a metadate repository, using metadata queries like " type=history", we can quickly find it.

  3. Metadata can reduce semantic gap and increase linked word phrases to data. For example, if a search engine understands that "VAn Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page.
  4. Metadata is an important structural data for the application of DBMS. For example, when developing a DBMS, establishing central database is the first step. Thus, we need to define entities, attributes, relationship between them using metadata. In the following illustration, using metadata to describe the type of each field. The field name in each table is known as metadata.


(http://151.100.3.84/DataTimeReportPortal/AboutMetadata.htm)


5. Technical metadata has improved search engine in World Wide Web. In order to obtain useful information from the mass of information available, it is necessary to use metadata to add value for ensuring consistency and variety.

6. Data warehousing is managed by the application of metadata to conveniently organize and retrieve a great deal of business or organicational information and data.

References:

http://en.wikipedia.org/wiki/Metadata

http://marinemetadata.org/guides/mdataintro/mdatadefined


Posting 3: DBMS Architectures

Introduction

Architecture of a database system presents the design principles of DBMS , including process examples, parallel structure, storage system design, transaction system implementation, query application, and so on. It is helpful reference to database researchers, practitioners, and other fields' people, who are interested in database system design techniques. The achitecture of DBMS can be broken down into two major structures: logical and physical architectures. Here, I only discuss the logical architecture of DBMS.

Logical Architecture of DBMS

Based on suggestion of the ANSI/SPARC Study Group, logical architecture of DBMS can be divided into three levels: external level, conceptual level, and internal level.



Logical DBMS Architecture
(www.dbmaker.com.tw/reference/manuals/tutorial/tutorial_03.html)
  • External Level

External level can be also called as view level. It functions as a window on which application programs can work and end users can capture information and manipulate data what they want. The users of the level do not directly associate with the storage and system administration of data except for the system and database adminstrators. In the level, application programmer uses a HOST LANGUAGE: COBOL, PL/1, C and embedded in the host language is a DATA SUBLANGUAGE DSL.

For example:

External (PL/1)

DCL 1 EMPP,

2 EMP# CHAR(6),

2 SAL FIXED BIN(31);

External (COBOL)

01 EMPC.

02 EMPNO PIC X(6).

02 DEPTNO PIC X(4).

  • Conceptual Level

Conceptual level, which is the first design phase of a DBMS, presents a logical view of the entire database as a whole. All the data from external level are collected in the central database in the level. There is a data definition language to manipulate the data based on what you need.
Data in the level is independent. Data definition language define the information content.

For example:

EMPLOYEE

EMPLOYEE_NUMBER CHARACTER (6)

DEPARTMENT_NUMBER CHARACTER (4)

SALARY NUMERIC (5)

  • Internal Level

Internal level may also be referred to as the physical level. The collection of files permanently stored on secondary storage devices is known as physical database. The level is the closest to physical storage, and it provides a low-level description of the physical database and an interface between the operating system's file system and the record structures used in higher levels of abstraction. In the level, both record types and methods of storage are defined as well as how stored fields are represented, what physical sequence the stored records are in, and what other physical structures exist.

For example:

STORED_EMP LENGTH-18

REFIX TYPE=BYTE(6), OFFSET=0

EMP# TYPE=BYTE(6), OFFSET=6,INDEX=EMPX

DEPT# TYPE=BYTE(4), OFFSET=12

PAY TYPE=FULLWORD, OFFSET=16

Mappings

The above three levels are not separate but associated with each other. There are two types of mappings: conceptual/internal mapping and external/conceptual mapping.

Conceptual/internal mapping defines the association between the records and the fields of the conceptual level and the files and data structures of the internal level. If the structure of the stored database is changed, then the conceptual/ internal mapping must also be changed accordingly so that the view from the conceptual level remains constant.

External/conceptual mapping defines the association between a particular external view and the conceptual view. Although these two levels are similar, some elements found in a particular external view may be different from the conceptual view.

Reference:

http://en.wikipedia.org/wiki/ANSI-SPARC

http://www.dbmaker.com.tw/reference/manuals/tutorial/tutorial_03.html

Thursday, April 2, 2009

Posting 2: Discussion of Advantages

Advantages of Using DBMS

(http://www.sybase.com/)

Compared with the file processing approach, which there are large volume of repetitive data in computer files and the work for changing data stored in different places is extremely heavy and complex when the programming code were altered, DBMS avoids the redundancy and inconsistency of processing data. Under a DBMS, most databases usually manage several files, tables, or entities, and hundreds, or even thousands, of tables can be managed simultaneously. (Jessup, Valacich, and Wade, 2008) For example, when change is done in a certain table, other interrelated data through all parts of DBMS in a business can be changed automatically; when changing programming code in any application, the business’ database does not need to be changed. Therefore, there are lots of advantages using DBMS to manage the data of business and organizations.

Under DBMSs, programs and data are separate, so it is easy to change either some programs or data according to business needs; for example, while a student’s address in the database of a university needs altering, no needs change any programs. Because of using a centralized DBMS, it is much more convenient and easier to share data and enforce standards and rules for processing data among an organization; the standardized data in DBMS make it easier to build and modify application; because using DBMS can minimize redundancy and inconsistency, data quality can be largely improved. In addition, either new personnel in an organization or outside users can have access to the organizational data easily due to centralized database system. Using DBMS can increase productivity and reduce the time of maintaining programs. There are also other advantages of DBMSs, including data-oriented and user-oriented security, administration software, and data recovery support. A practical benefit is reduced application development time: the system is already built, it needs only data and queries to access the data.


References
Jessup, L., Valacich, J. & Wade, M. (2008). Information System Today: Why IS Matters. Pearson Education, Inc.

MACurran & Kellum, S. (July15, 2000). Nursing Informatics: Advantages & Disadvantages. Retrieved March 31, 2009,
form http://personal.uncc.edu/macurran/macurran3/coni/advdisadv.htm

Posting1: Introduction and History of DBMS



Introduction

DBMSs are a set of software programs that allow organizations store, manage, retrieve, and analyze information from a single or several databases. All information helping an organization to do business is stored in databases which include a collection of related data organized in a way that facilitates data searches. Data is retrieved from a database through the use of queries and reports, and the users of information systems use DBMS’ interface language SQL to interact with the data from a database. The main method structuring the relationships among data entities is the relational database model. By normalizing complex databases, the users can easily maintain and manipulate data. (Jessup, Valacich, and Wade, 2008) There are different categories of DBMSs from small systems, which can be operated in PC, to huge systems, which can be running in mainframes.

History
In the earliest days of electronic computing, databases were applied to the older systems in the form of custom. (Ross, 2009) At first, only some large organizations could afford to use DBMS with applying the hardware to support large data sets. They used the file processing approach to store and operate data electronically. Data were usually held in computer files that were long, sequential, and often stored on magnetic tape. Information including identical data often was put in several different places throughout the information system. The programmers of the file had not yet found the way to solve the redundant problem about storing data, so repetitive data about customers, suppliers, or other companies used to be stored in several files. If a certain code was changed, related data in each file had to be changed. (Jessup, Valacich, and Wade, 2006)

In the early 1960s, Database Task Force Group within Codasyl was responsible for beginning to create and standardize the computer language (COBOL) in databases. There were no both functions of “find” and “search”, so it is difficult to search some data what someone wanted.

In 1968, IBM established their own DBMS during the development of “Apollo project”, which was called as IMS using hierarchical database model as the method of data navigation.
In 1971, CoastyI approach” was established by “Database Task Group”, and the method became the formal standard to database management, whose navigational model was network model. The system was a linked-list one.

In 1973, both hierarchical database model and network model were developed to be the navigational databases.

Due to the lack of the function of “search” in the navigational databases, Edgar Codd, woking at IBM, wrote a paper to debate with CoastyI approach and introduced “A Relational Model of Data for large shared Data Banks”. In the new database construction, the data is split into a series of normalized tables where each row is a record and each column is an attribute (field). A record is a collection of related attributes about a single table. For a user, some related tables that store different information need to be built with its related attributes by linking them with a unique “key” defining a related table. Also, in order to solve the necessary looping, Codd suggested that should design a set-oriented language, which will eventually be developed to be the SQL.

After Codd’s demonstrations, two projects about producing relational products were initiated, the first one of which was Ingres beginning in 1973, which produced one test product that was widespread used in 1979; the other was IBM’s System R, which produced two versions, including “quickie” and multi-user version that had been added the query language, SQL.

In 1976, IBM delivered a prototype for a relational DBMS. From then on, the competition of development of SQL production began.

In 1977, a specified database was designed to government agencies by computer programmers, Larry Ellison and Robert Miner, who found the Oracle systems. Corp. Meantime, they learned from IBM, also using SQL to produce a relational DBMS similar to IBM’s.

In 1978, Oracle released the first RDBMS. Relational databases eventually divided by all other database types, because highly complex queries can be used and various tools’ usefulness was largely enhanced.

In 1983, the first portable RDBMS was delivered by Oracle, which could use on various platforms like mainframes, workstations, and PC. After that, a distributed DBMS with the base on SQL-Star software was initiated, which could have access to data stored on a network if these data was placed in a single computer.
In 1993, Microsoft Corp. invented Access. With RDBMS grow quickly, and as components of computer price began decrease, lots of business DBMSs occurred on the market. Therefore, in order to gain competitive advantage, many organizations accepted the DBMSs and began using related information systems. (Mann, 2006)


References:
Jessup, L., Valacich, J. & Wade, M. (2008). Information System Today: Why IS Matters. Pearson Education, Inc.

Mann, M. (June 28, 2006). History and Comparison of Relational Database Management Systems. HVB Information Services, HIS14DB

Ross, S. (2009). The History of Database Management Systems. Ezine @rticles. Retrieved March 31, 2009 form
http://ezinearticles.com/?The-History-of-Database-Management-Systems&id=904394

Webopedia. Database Management System. Retrieved March 31, 2009 from
http://www.webopedia.com/TERM/D/database_management_system_DBMS.html.