Systems Analysis Techniques
There has always been a blurred borderline between techniques for systems analysis and techniques for systems design. In this document, analysis techniques are considered relevant for requirements elicitation and for modelling existing systems. The most appropriate techniques for modelling existing systems are Data Flow Modelling and Normalisation; however, we begin by discussing the nature of corporate information and the interview as a technique for requirements elicitation.
Before discussing the techniques available for identifying and recording information, a brief description of the nature of organisational information will be useful.
Flow of Information
Information is the lifeblood of many organisations. Information flows into and out of organisations and between the different levels in an organisation. Typically, at the lower levels of an organisation, information is very detailed, e.g. relating to individual customers, orders, suppliers, invoices etc. As information flows up the hierarchy of an organisation it tends to become summarised. As an example, consider a banking environment. A teller is interested in specific information about the account they are currently dealing with, such as account number and balance. At the end of each day the branch manager may receive a summary report showing the total of all balances of accounts at that branch, together with a short list of those individual customers with balances of less than -£500 or greater than £5,000 (an example of exception reporting). At the end of each week the area manager or director may receive a list of customers with balances of greater than £10,000. The flow of information in an organisation is described in more detail in Unit 3 (Organisations and Information Technology). It is important in this context because much of the work of the systems analyst/designer is concerned with identifying the flow and structure of information within an organisation.
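As a concrete illustration of summarisation and exception reporting, the branch manager's end-of-day report can be expressed as a simple sum and filter. The sketch below is purely illustrative; the record layout, field names and sample balances are assumptions, not taken from any real banking system.

```python
# A minimal sketch of exception reporting: from detailed account records,
# produce a summary total for the branch manager plus an exception list.
# Field names, thresholds and sample data are illustrative only.
accounts = [
    {"account_number": "01234567", "balance": 153.00},
    {"account_number": "01234568", "balance": -750.00},
    {"account_number": "01234569", "balance": 12500.00},
]

branch_total = sum(a["balance"] for a in accounts)  # summary for the branch manager

# Exception report: balances below -£500 or above £5,000
exceptions = [a for a in accounts if a["balance"] < -500 or a["balance"] > 5000]

print(f"Total of all balances: £{branch_total:.2f}")
for a in exceptions:
    print(f"Exception: account {a['account_number']}, balance £{a['balance']:.2f}")
```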
Different Types and Levels of Data/Information
There is a distinction between data and information which can be simply described by the statement 'information is data which has been processed such that it becomes meaningful'. In other words, information is data which has been placed in a specific context. Consider the number 153. On its own this is meaningless data. When placed in a particular context, e.g. £153, the number becomes more meaningful because we now know that we are referring to 153 pounds. However, this is still data rather than information. £153 only becomes information when you find out that it is the balance of account number 01234567.
Uses of Information
There are many uses of information :-
- To fulfil legal requirements;
- To provide background knowledge;
- Decision support (better information results in better decisions);
- Enquiry processing;
- Analysis of trends (forecasting).
Standard Documents & Sources of Data/Information
There are many standard documents commonly in use in organisations. Some have been mentioned already, e.g. profit and loss statements, balance sheets, bills, payslips. Others include order forms, application forms, delivery notes, invoices, business letters etc. These can provide very useful information to the systems analyst. Other sources of data include existing computer systems and their documentation, the internet, and tables of data in magazines and newspapers.
Information Gathering Techniques / Interviews
The purpose of an interview is to identify how a person currently does their job (how the existing system works), the problems they face (what is wrong with the existing system) and how they would like to do their job (what is required of the new system).
A major factor in conducting interviews is the attitude of the interviewee and the interviewer. Remember that you're not there to impress the interviewee with your knowledge of computers, so don't talk about megabytes, hard disk sizes and processor speeds.
Q. Who uses computers?
A. Mothers, Fathers, Aunts, Uncles, Brothers, Sisters etc.
Think of your friends and relations: what is their range of computing experience? The people you interview may be experts or complete novices, but they are likely to be apprehensive about the impact that a new computer system will have on their jobs (will I still have a job? will I have to learn new skills? will I have to change the way I work?). As a systems analyst you have to build trusting relationships with these people in order to get the best information from them. An open, friendly, reassuring attitude is required.
Interview essentials :-
- Preparation: You need to be well prepared; you have to inspire the confidence of the people you interview by demonstrating that you can understand what they do and appreciate the problems they face. Some detailed background reading is advisable;
- Convenience: You should interview people at their convenience, at their place of work. Don't expect to get very far if you summon people to your office at short notice;
- Dress Code: Respect the dress code of the people you are interviewing; don't expect to inspire confidence if you are interviewing a suit while dressed in tee-shirt, jeans and trainers. This is not to say that dressing smartly will automatically inspire confidence, or that it is impossible to inspire confidence while dressed casually, just that dressing casually in a semi-formal environment puts you at a disadvantage and you will have to work harder to inspire confidence;
- Body Language: How many of you have felt uncomfortable when a relative stranger sits too close and invades your space, or when someone won't look you in the eye (or indeed when someone looks into your eyes too much)? Try not to use threatening body language when interviewing. A useful tip is not to sit across a table from someone (this is uncomfortably reminiscent of helping the police with their enquiries). On the other hand, don't sit right next to someone, since you can get too close and it is then difficult to maintain eye contact. The ideal position is at right angles to the interviewee: this is neither threatening nor too close, allows eye contact, but also allows eye contact to be broken comfortably.
Types of Question :-
When conducting interviews there are different types of question which can be asked and it is important to know which kinds to use :-
- Open-Ended Questions: questions inviting the interviewee to develop their opinions rather than give terse replies. These tend to start with who, where, what, when or why, e.g. "What does your job involve?";
- Closed Questions: questions seeking precise information, e.g. "Do you have...?" or "Have you done...?";
- Rhetorical Questions: questions which do not require an answer, e.g. "We all want to improve productivity, don't we?";
- Leading or Loaded Questions: questions that suggest the answer, e.g. "I believe in the strict control of expenditure and debtors, what about you?". Sometimes loaded questions can put people in a no-win situation, e.g. "When did you stop beating your husband?".
Try to avoid using loaded or rhetorical questions, since they serve no useful purpose. Loaded questions in particular may cause offence because they imply that you already know the answer and will stick to it regardless of what the person actually says. It is also important that you avoid answering your own questions: bite your tongue! A useful approach is to use open questions to get the big picture and progressively move towards closed questions as more detail is uncovered.
Listening Skills :-
So much for questioning; the next step is listening to the answers. You need to show the interviewee that you are listening to their answers and that you are interested by :-
- Making eye contact when speaking and listening (but don’t overdo it);
- Adopting a relaxed posture (leaning slightly forward conveys interest, leaning back conveys boredom);
- Using verbal reassurance, e.g. "I see what you mean" but again don’t overdo it as the same phrase used repeatedly can become aggravating;
- Paraphrasing: restating, in summary form, your understanding of what the interviewee has said;
- Clarifying: don't be afraid to tell the interviewee that you don't understand something, but be careful how you say it, e.g. "I'm sorry, but I didn't quite understand the last part, could you take me through it again?" is obviously preferable to "You didn't explain that very clearly, can you say it again?".
Data Flow Modelling
The objectives of this section are to: provide definitions of the terms Data Flow Model and Data Flow Diagram, explain the components and representations which comprise Data Flow Diagrams, and introduce a step by step procedure for creating Data Flow Diagrams.
What is a Data Flow Model?
A Data Flow Model (DFM) defines the passage of data through a system. The DFM comprises a consistent set of hierarchic Data Flow Diagrams (DFDs) and associated documentation.
Hierarchic Data Flow Diagrams
The word hierarchy implies that there are different levels of complexity. In terms of Data Flow Diagrams (DFDs) there may be up to four levels. At the highest level (often called a context diagram and sometimes called a level 0 DFD) all the complexities of the internal workings of a system are hidden from view by representing the entire system as one black-box process which receives input data flows and transmits output data flows. The next level down consists of a single level 1 DFD which provides an overview of the 6-10 processes that make up a typical system. Each process in a level 1 DFD has its own level 2 DFD in which the process is described in some detail. Some complex systems may require level 3 DFDs for certain level 2 processes.
What do Data Flow Diagrams Consist of?
Data Flow Diagrams at all levels consist of 4 components, i.e. External Entities, Data Flows, Processes and Data Stores.
An external entity is a person, organisation, department, computer system or anything else which either sends data into a system (sometimes called a source) or receives data from a system (sometimes called a sink), but which, for the purposes of the project in question, is outside the scope of the system itself. External entities (in the SSADM scheme) are represented as ovals containing the name of the external entity and a unique alphabetic identifier.
Tutor Guidance: A useful technique in this area, described in Goodland and Slater (1995), is the composition and decomposition of external entities, whereby an 'organisation' level external in a context diagram can be broken down into 'department' level externals at level 1 and subsequently 'user role' externals at level 2. This fairly simplistic approach may help learners to pitch their DFDs at the right level.
A data flow is a route by which data may travel from one element of a DFD to another. Data flows are represented by arrows which are labelled with a simple, meaningful name.
Processes are transformations which change incoming data flows into outgoing data flows. Processes are represented as rectangles which contain a simple description of the process, and each process has a unique reference number. In the early stages it is possible to show where in the organisation the process takes place; however, this is a physical constraint imposed by the existing system and should not appear in a completed 'logical' data flow diagram.
A data store is a repository for data. A data store is represented by an open-ended rectangle containing the name of the data store (usually a plural noun such as Customers); each data store has a unique reference number prefixed by the letter D.
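Although SSADM prescribes only the diagram notation, some readers find a concrete representation of the four components helpful. The following is a minimal sketch in Python; the class and field names are our own illustrative choices, not part of the method.

```python
from dataclasses import dataclass

@dataclass
class ExternalEntity:
    identifier: str   # unique alphabetic identifier, e.g. "a"
    name: str         # e.g. "Customer"

@dataclass
class Process:
    number: str       # unique reference number, e.g. "1" or "1.2"
    description: str  # simple description of the transformation

@dataclass
class DataStore:
    number: str       # unique reference prefixed by D, e.g. "D1"
    name: str         # usually a plural noun, e.g. "Customers"

@dataclass
class DataFlow:
    name: str         # simple meaningful label
    source: object    # an ExternalEntity, Process or DataStore
    target: object    # an ExternalEntity, Process or DataStore
```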
Valid and Invalid Data Flows
The components cannot be connected arbitrarily: a data flow is valid only if at least one end of it is a process. Flows directly between two external entities, between an external entity and a data store, or between two data stores are invalid, because data cannot move or be transformed without a process to do the work.
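Continuing the sketch above, this rule can be expressed as a one-line check (the entity, process and store names here are invented for illustration):

```python
def flow_is_valid(flow: DataFlow) -> bool:
    """Valid only if at least one end of the flow is a Process: data cannot
    move between externals and/or stores without being processed."""
    return isinstance(flow.source, Process) or isinstance(flow.target, Process)

customer = ExternalEntity("a", "Customer")
handle_order = Process("1", "Handle Order")
orders_store = DataStore("D1", "Orders")

print(flow_is_valid(DataFlow("order details", customer, handle_order)))    # True
print(flow_is_valid(DataFlow("order record", handle_order, orders_store)))  # True
print(flow_is_valid(DataFlow("?", customer, orders_store)))                 # False: no process
```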
How are DFDs Constructed?
Again there are no hard and fast rules, and many re-drafts will be necessary as the analyst's understanding improves, as new requirements are identified and as the DFM is validated against the Logical Data Structure (LDS). The following steps may be useful:-
- Establish the major inputs and outputs of the system and their sources and recipients, and represent them in a context diagram (a context diagram is simply a very high level DFD which represents the entire system as one process);
- Establish the processes which handle data flows on their arrival into the system and which generate the output data flows;
- Identify the data stores which are required to link the input and output processes, i.e. the data stores which need to be read from and written to;
- Rationalise the level 1 DFD so that it includes 6-10 processes (this may be done by combining and/or splitting processes);
- For each level 1 process draw a level 2 DFD and, if necessary, draw any level 3 DFDs required;
- Review the entire DFD set against the identified requirements and re-draft if necessary.
Tutor Guidance: In step 2 it may be useful to uniquely number each data flow and consider which flows can logically be dealt with by which processes.
This is a top-down approach to Data Flow Modelling; alternatively you can work bottom-up by identifying low level processes and grouping them. The approach itself is not important, and often a hybrid approach is taken; what is important is that at the end of the exercise you have a consistent set of DFDs.
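One mechanical test of consistency across levels is the 'balancing' check used in structured analysis: the data flows crossing the boundary of a process at one level must reappear on the diagram which decomposes it. A minimal sketch of such a check, assuming flows are recorded simply as sets of labels (the labels below are invented):

```python
# Balancing check: a parent process and its child diagram must expose the
# same set of boundary data flows, otherwise the DFD set is inconsistent.
def balanced(parent_flows: set, child_flows: set) -> bool:
    return parent_flows == child_flows

parent = {"reservation form", "notification"}      # flows on the level 1 process
child = {"reservation form", "notification"}       # flows on its level 2 DFD
print(balanced(parent, child))                     # True: the levels agree
print(balanced(parent, child | {"fine payment"}))  # False: an unbalanced extra flow
```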
Tutorial Sheet: Data Flow Diagrams
Objectives
To model the LRC system using Data Flow Diagrams.
Scenario
The Learning Resources Centre (LRC) at the University of Glamorgan services the requirements of students, staff and external members, who are allowed to reserve books and take books out on loan.
The following activities are currently supported by manual procedures:-
- Reserving a book involves filling in a form and passing it to LRC staff. The member will then be notified by mail when a copy of the reserved book comes in. Taking a book out on loan simply involves taking the book and your library card to the issue/return desk, where the information is recorded. Returns involve handing the books back to the desk and paying any fines due.
- If books are overdue the LRC sends a reminder to the member concerned; the member will be warned that they cannot take any more books out of the LRC until they have returned the overdue books and paid any fines due.
- The LRC stocks many book titles and for each book title there may be many copies. Copies may be on loan or on the shelves (on short loan, 1-week loan or 3-week loan). The system maintains details of all book titles and copies.
- Information is held on book copies for six months following the physical removal of the copy. Information on book titles is removed when the information regarding the last copy of that book has been removed.
Task 1
Establish the major inputs and outputs of the system, their sources and recipients and represent them using a context diagram.
Task 2
Uniquely identify each data flow with a number and create a table as follows. The verbs used are not intended to be prescriptive, rather to indicate the kind of verbs to be used; woolly phrases, such as 'process', show that the level of understanding is insufficient.
Process Number | Process Name (Concise, Descriptive) | Data Flows Supported |
1. | Create | 1,2 |
2. | Validate | 4 |
3. | Update | 7,9,10 |
4. | Collate | 3,5 |
5. | Delete | 6,8,11 |
6. | etc. | 12 |
NB At this stage try to use no more than 6 processes. (My guidelines suggest 6-10; however, it is far more likely that a process will have been omitted in the early stages and subsequently has to be included than vice versa.)
Task 3
Draw a first draft of the level 1 DFD, ensuring that the externals are at the extremities of the diagram.
Task 4
Identify the data stores which are required to link the input and output processes and re-draft the diagram.
Task 5
Select a level 1 process and draw a level 2 DFD for it (in a real situation you would obviously have to do this for every process in the level 1 DFD).
Task 6
Review the entire DFD set against the identified requirements and redraft if necessary. A basic quality assurance check is to ensure that each data store has at least one input/update data flow and at least one enquiry/report/transmission data flow.
Normalisation
Learner Guidance: If you want further information, the School of Computing has developed a Computer Based Learning package supporting the Normalisation technique. This package will eventually be available through the web, but for now, locate the G:\Library\CBL\Normal folder (using 'My Computer') and double click on the Normmen3.app icon.
Normalisation is defined briefly but accurately in the following statement:
‘The Key the Whole Key and Nothing but the Key (so help me Codd)’
Typically the literature on normalisation covers many levels of normalisation (nine is not uncommon), but this seems to me to be a race amongst academics to identify as many levels as possible; in 99 cases out of 100, three levels of normalisation are all that is required.
1st Normal Form: converting an un-normalised data structure, such as a report or an order form, into 1st Normal Form (1NF) is commonly referred to as removing repeating groups, but may also involve removing complex groups such as the Address group described in rule 2 (see chapter 5). The aim is to ensure that each item is atomic.
2nd Normal Form: converting a 1NF data structure into 2nd Normal Form (2NF) involves looking at each non-primary-key attribute and ensuring that it depends on the whole of the key and not just part of it.
3rd Normal Form: converting a 2NF data structure into 3rd Normal Form (3NF) involves looking at the interrelationships between non-key attributes to see if any non-key attributes depend only on each other.
This is all best described by looking at an example. Consider the following table, which has been built up by an order entry clerk:
Cust# | Cust Name | Ord# | Date | Part# | Desc | Qty | Price | Supp# | Supp Name |
1 | Tim | 123 | 20/3 | 1 | AA | 2 | 1.99 | 23 | ABC |
  |  |  |  | 2 | BB | 3 | 2.99 | 23 | ABC |
  |  |  |  | 3 | CC | 4 | 3.99 | 24 | DEF |
  |  | 456 | 21/3 | 4 | DD | 5 | 4.99 | 25 | GHI |
  |  |  |  | 5 | EE | 6 | 5.99 | 26 | JKL |
2 | John | 789 | 21/3 | 4 | DD | 7 | 4.99 | 25 | GHI |
  |  |  |  | 6 | FF | 8 | 6.99 | 27 | MNO |
This table structure could be implemented quite easily in COBOL or in a network DBMS, as shown in chapter 5, with all the associated problems.
A common representation of this kind of table in textbooks is as follows:
CUSTOMERS(Customer_Number, Customer_Name, (Order_Number, Order_Date, (Part_Number, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name)))
The internal brackets are meant to represent repeating groups and the underline represents a primary key. This is called an un-normalised or 0NF data structure.
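The bracket notation maps directly onto nested data structures, which is one way to see what a 'repeating group' means in practice. Below is an illustrative Python sketch of the first customer's record, with values taken from the table above (the representation is ours, not part of the notation):

```python
# One CUSTOMERS record in 0NF: orders repeat within a customer, and parts
# repeat within an order, exactly mirroring the nested brackets.
customer = {
    "Customer_Number": 1,
    "Customer_Name": "Tim",
    "Orders": [                    # repeating group of orders
        {"Order_Number": 123, "Order_Date": "20/3",
         "Parts": [                # nested repeating group of parts
             {"Part_Number": 1, "Part_Description": "AA", "Part_Quantity": 2,
              "Part_Price": 1.99, "Supplier_Number": 23, "Supplier_Name": "ABC"},
             {"Part_Number": 2, "Part_Description": "BB", "Part_Quantity": 3,
              "Part_Price": 2.99, "Supplier_Number": 23, "Supplier_Name": "ABC"},
         ]},
    ],
}
print(len(customer["Orders"][0]["Parts"]))  # 2 parts shown in the first order
```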
There are two approaches to converting this 0NF structure to 1NF. The first involves replicating the values in the table as follows:
Cust# | Cust Name | Ord# | Date | Part# | Desc | Qty | Price | Supp# | Supp Name |
1 | Tim | 123 | 20/3 | 1 | AA | 2 | 1.99 | 23 | ABC |
1 | Tim | 123 | 20/3 | 2 | BB | 3 | 2.99 | 23 | ABC |
1 | Tim | 123 | 20/3 | 3 | CC | 4 | 3.99 | 24 | DEF |
1 | Tim | 456 | 21/3 | 4 | DD | 5 | 4.99 | 25 | GHI |
1 | Tim | 456 | 21/3 | 5 | EE | 6 | 5.99 | 26 | JKL |
2 | John | 789 | 21/3 | 4 | DD | 7 | 4.99 | 25 | GHI |
2 | John | 789 | 21/3 | 6 | FF | 8 | 6.99 | 27 | MNO |
However, this is a clumsy approach and results in a three-part key consisting of Cust#, Ord# and Part#. A simpler approach is to separate the repeating groups out into separate tables.
Step 1: remove the repeating group of orders
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date, (Part_Number, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name))
Step 2: remove the repeating group of parts
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date)
ORDER_PARTS(Part_Number, Order_Number*, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name)
The structure is now in 1NF since there are no repeating or complex group items (each item is atomic). The next step is to convert the structure into 2NF by examining each non-primary-key attribute to ensure that it depends on the whole of the key.
The CUSTOMERS and ORDERS tables each have a single column making up their primary key and are therefore, by definition, in 2NF. However, looking at the ORDER_PARTS table, it can be seen that Part_Description, Part_Price, Supplier_Number and Supplier_Name depend only on Part_Number, i.e. their values are the same regardless of Order_Number. (Part_Quantity depends on the whole of the key, since different quantities can appear on different orders.) To convert to 2NF, a separate table is created for part descriptions, prices and supplier details.
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date)
ORDER_PARTS(Part_Number, Order_Number*, Part_Quantity)
PARTS(Part_Number, Part_Description, Part_Price, Supplier_Number, Supplier_Name)
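The partial-key dependency in ORDER_PARTS can also be spotted mechanically from sample rows: if an attribute takes the same value for every occurrence of a given Part_Number, it depends on the part alone and not on the whole (Order_Number, Part_Number) key. The helper below is our own sketch, and it is of course only as reliable as the sample data it is given:

```python
# Rows from the 1NF ORDER_PARTS table: (Part_Number, Order_Number, Description, Quantity)
rows = [
    (4, 456, "DD", 5),
    (4, 789, "DD", 7),
]

def depends_only_on_part(rows, attr_index):
    """True if the attribute at attr_index never varies for a given Part_Number."""
    seen = {}
    for row in rows:
        part = row[0]
        if part in seen and seen[part] != row[attr_index]:
            return False  # same part, different value: needs the whole key
        seen[part] = row[attr_index]
    return True

print(depends_only_on_part(rows, 2))  # True: description depends on the part alone
print(depends_only_on_part(rows, 3))  # False: quantity needs part AND order
```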
The structures are now in 2NF since every non-primary key attribute depends on the whole of the key. The next step is to convert the structure into 3NF by ensuring that each non-primary key attribute depends on nothing but the key.
The CUSTOMERS table is patently in 3NF because there is no non-primary-key attribute for Customer_Name to depend on. The ORDERS table is in 3NF because there is no dependency between Order_Date and Customer_Number (a customer can place different orders on different dates). The ORDER_PARTS table is in 3NF because the quantity ordered depends on both the order number and the part number. Looking, however, at the PARTS table, it can be seen that the Supplier_Name attribute depends on Supplier_Number and has nothing to do with the part number. To convert the structure into 3NF, a separate table is created containing supplier details.
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date)
ORDER_PARTS(Part_Number, Order_Number*, Part_Quantity)
PARTS(Part_Number, Supplier_Number*, Part_Description, Part_Price)
SUPPLIERS(Supplier_Number, Supplier_Name)
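As an illustrative check that normalisation has lost no information, the five 3NF tables can be populated with the sample data and joined back together to reconstruct any line of the original order entry table. The dictionary representation below is our own sketch, not part of the technique itself:

```python
# The five 3NF tables as simple keyed dictionaries (sample data from above).
customers = {1: "Tim", 2: "John"}
orders = {123: (1, "20/3"), 456: (1, "21/3"), 789: (2, "21/3")}  # order -> (customer, date)
order_parts = {(123, 1): 2, (123, 2): 3, (123, 3): 4,            # (order, part) -> quantity
               (456, 4): 5, (456, 5): 6, (789, 4): 7, (789, 6): 8}
parts = {1: (23, "AA", 1.99), 2: (23, "BB", 2.99), 3: (24, "CC", 3.99),
         4: (25, "DD", 4.99), 5: (26, "EE", 5.99), 6: (27, "FF", 6.99)}
suppliers = {23: "ABC", 24: "DEF", 25: "GHI", 26: "JKL", 27: "MNO"}

def order_line(order_no, part_no):
    """Join the five tables to rebuild one line of the original clerk's table."""
    cust_no, date = orders[order_no]
    supp_no, desc, price = parts[part_no]
    qty = order_parts[(order_no, part_no)]
    return (cust_no, customers[cust_no], order_no, date,
            part_no, desc, qty, price, supp_no, suppliers[supp_no])

print(order_line(123, 1))  # (1, 'Tim', 123, '20/3', 1, 'AA', 2, 1.99, 23, 'ABC')
```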
Tutor Guidance: The best way I have come across is to use a real document, such as an order form, a printed report or a screen dump of a transaction, and to make a deliberate mistake when carrying out the normalisation process.