Details On COMPILERS



A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program.

The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower-level language (e.g., assembly language or machine code). If the compiled program can only run on a computer whose CPU or operating system is different from the one on which the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low-level language to a higher-level one is a decompiler. A program that translates between high-level languages is usually called a language translator, source-to-source translator, or language converter. A language rewriter is usually a program that translates the form of expressions without a change of language.

A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (syntax-directed translation), code generation, and code optimization. Program faults caused by incorrect compiler behavior can be very difficult to track down and work around, so compiler implementors invest a lot of time ensuring the correctness of their software. The term compiler-compiler is sometimes used to refer to a parser generator, a tool often used to help create the lexer and parser.
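To make the phases named above more concrete, here is a minimal sketch in Python of lexical analysis, parsing, and code generation for a toy language of integers, "+", "*", and parentheses. The toy language, the stack-machine instruction names (PUSH, ADD, MUL), and all function names are assumptions made for illustration; a real compiler is far more elaborate.

```python
import re


def tokenize(source):
    """Lexical analysis: split the source text into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|\S", source):
        if text.isdigit():
            tokens.append(("NUM", int(text)))
        elif text in "+*()":
            tokens.append((text, text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens


def compile_expr(tokens):
    """Parsing and code generation: emit instructions for a stack machine."""
    code = []
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        if peek() == "NUM":
            code.append(("PUSH", advance()[1]))
        elif peek() == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError("expected a number or '('")

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()
            code.append(("MUL",))

    def expr():
        term()
        while peek() == "+":
            advance()
            term()
            code.append(("ADD",))

    expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return code


def run(code):
    """Execute the generated instructions (stands in for the target machine)."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instruction[0] == "ADD" else left * right)
    return stack[0]


program = compile_expr(tokenize("2 + 3 * (4 + 1)"))
result = run(program)
```

Because the parser handles "*" one level below "+", multiplication binds tighter than addition without any separate precedence table. A misspelled input such as "2 $ 3" is rejected during lexical analysis, before parsing starts.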

PURPOSE
The purpose of a compiler is to translate programs written in a human-readable language into machine language that the computer can execute. A compiler reads the instructions in a program and translates them; if it finds an error, it identifies the error and reports it to the user. The compiler can detect typing mistakes (syntax errors) and some semantic errors such as type mismatches, but it generally cannot detect errors in the program's logic. When all the reported errors are removed, the translated instructions can be sent to the computer for processing.


SCOPE
·        Scope is the portion of a program in which a particular identifier is visible. Different programming languages have different scoping rules.
·        Scoping refers to identifiers; this includes variable names, constant names, function names, procedure names, program names, etc.
·        A declaration is a binding occurrence.
·        When a programming language does not require a declaration, the first applied occurrence of an identifier is also a binding occurrence, with attributes being inferred from the usage.
·        The scope of an identifier may not be the same as its lifetime; the two attributes are distinct.
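The difference between scope and lifetime in the last point can be illustrated with a small Python sketch. The names make_counter, increment, and count are hypothetical and chosen only for this example.

```python
def make_counter():
    count = 0  # 'count' is in scope only inside make_counter and increment

    def increment():
        nonlocal count  # refers to the binding created in make_counter
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()

# The identifier 'count' is not visible here (out of scope), yet the variable
# it named is still alive, because 'counter' keeps using and updating it.
try:
    count
    count_visible = True
except NameError:
    count_visible = False
```

Here the scope of count ends with the body of make_counter, while its lifetime lasts as long as the returned function exists.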

2. Feasibility study
2.1 Technical feasibility:
This project is intended for Turbo C. As long as Turbo C is available, the project will run smoothly, provided all the inconsistencies are removed.
2.2 Economic feasibility
The cost of developing the project lies mainly in software licensing and manpower requirements. Since the project is intended to be developed on open-source systems, the main financial constraint lies in the manpower requirements. Given the required manpower, the project is feasible within the time limits set by the customer.
Each phase of the life cycle has a cost.
·        Personnel
·        Equipment
·        Technology
·        Consultants' fees
·        Hidden cost
Based on the above, our project is economically feasible: personnel and hidden costs are affordable, the equipment and technologies are available, and there are no consultants' fees.
2.3 Time constraints
Given enough manpower resources, the project can be developed within the time limits.

Software Requirements Specifications Document

1.   Introduction
The following subsections of the Software Requirements Specifications (SRS) document should provide an overview of the entire SRS. The thing to keep in mind as you write this document is that you are telling what the system must do – so that designers can ultimately build it. Do not use this document for design!!!


1.1   Purpose
Describe the purpose of this particular SRS and specify its intended audience.


1.2   Scope
In this subsection:

(1)   Identify the software product(s) to be produced by name

(2)   Explain what the software product(s) will, and, if necessary, will not do

(3) Describe the application of the software being specified, including relevant benefits, objectives, and goals

(4)   Be consistent with similar statements in higher-level specifications if they exist

This should be an executive-level summary. Do not enumerate the whole requirements list here.


1.3   Definitions, Acronyms, and Abbreviations.
Provide the definitions of all terms, acronyms, and abbreviations required to properly interpret the SRS. This information may be provided by reference to one or more appendices in the SRS or by reference to other documents.


1.4   References
In this subsection:

(1)  Provide a complete list of all documents referenced elsewhere in the SRS

(2)  Identify each document by title, report number (if applicable), date, and publishing organization

(3)  Specify the sources from which the references can be obtained.

This information can be provided by reference to an appendix or to another document. If your application uses specific protocols or RFCs, then reference them here so designers know where to find them.

1.5 Overview

In this subsection:

(1)  Describe what the rest of the SRS contains

(2)  Explain how the SRS is organized

Don’t rehash the table of contents here. Point people to the parts of the document they are most concerned with. Customers/potential users care about section 2, developers care about section 3.

2.   The Overall  Description
Describe the general factors that affect the product and its requirements. This section does not state specific requirements. Instead, it provides a background for those requirements, which are defined in section 3, and makes them easier to understand. In a sense, this section tells the requirements in plain English for the consumption of the customer. Section 3 will contain a specification written for the developers.


2.1   Product Perspective
Put the product into perspective with other related products. If the product is independent and totally self-contained, it should be so stated here. If the SRS defines a product that is a component of a larger system, as frequently occurs, then this subsection relates the requirements of the larger system to the functionality of the software and identifies interfaces between that system and the software. If you are building a real system, compare its similarities to and differences from other systems in the marketplace. If you are doing a research-oriented project, describe the related research that compares to the system you are planning to build.

A block diagram showing the major components of the larger system, interconnections, and external interfaces can be helpful. This is not a design or architecture picture. It is more to provide context, especially if your system will interact with external actors. The system you are building should be shown as a black box. Let the design document present the internals.

The following subsections describe how the software operates inside various constraints.


2.1.1 System Interfaces
List each system interface and identify the functionality of the software to accomplish the system requirement and the interface description to match the system. These are external systems that you have to interact with. For instance, if you are building a business application that interfaces with the existing employee payroll system, what is the API to that system that designers will need to use?

2.1.2 Interfaces
Specify:

(1)  The logical characteristics of each interface between the software product and its users.

(2)  All the aspects of optimizing the interface with the person who must use the system


2.1.3 Hardware Interfaces
Specify the logical characteristics of each interface between the software product and the hardware components of the system. This includes configuration characteristics. It also covers such matters as what devices are to be supported, how they are to be supported, and protocols. This is not a description of hardware requirements in the sense that “This program must run on a Mac with 64M of RAM”. This section is for detailing the actual hardware devices your application will interact with and control. For instance, if you are controlling X10 type home devices, what is the interface to those devices? Designers should be able to look at this and know what hardware they need to worry about in the design. Many business type applications will have no hardware interfaces. If none, just state “The system has no hardware interface requirements”. If you just delete sections that are not applicable, then readers do not know whether (a) the section does not apply or (b) you forgot to include it in the first place.

2.1.4 Software Interfaces
Specify the use of other required software products and interfaces with other application systems. For each required software product, include:

(1)  Name

(2)  Mnemonic

(3)  Specification number

(4)  Version number

(5)  Source


2.1.5 Communications Interfaces
Specify the various interfaces to communications such as local network protocols, etc. These are protocols you will need to directly interact with. If you happen to use web services transparently to your application then do not list it here. If you are using a custom protocol to communicate between systems, then document that protocol here so designers know what to design. If it is a standard protocol, you can reference an existing document or RFC.


2.1.6 Memory Constraints
Specify any applicable characteristics and limits on primary and secondary memory. Don’t just make up something here. If all the customer’s machines have only 128K of RAM, then your target design has got to come in under 128K, so there is an actual requirement. You could also cite market research here for shrink-wrap type applications: “Focus groups have determined that our target market has between 256-512M of RAM, therefore the design footprint should not exceed 256M.” If there are no memory constraints, so state.

2.1.7 Operations
Specify the normal and special operations required by the user such as:

(1)  The various modes of operations in the user organization

(2)  Periods of interactive operations and periods of unattended operations

(3)  Data processing support functions

(4)  Backup and recovery operations

2.1.8 Site Adaptation  Requirements

In this section:

(1) Define the requirements for any data or initialization sequences that are specific to a given site, mission, or operational mode

(2) Specify the site or mission-related features that should be modified to adapt the software to a particular installation

If any modifications to the customer’s work area would be required by your system, then document that here. For instance, “A 100 kW backup generator and 10000 BTU air conditioning system must be installed at the user site prior to software installation”.

This could also be software-specific like, “New data tables created for this system must be installed on the company’s existing DB server and populated prior to system activation.” Any equipment the customer would need to buy or any software setup that needs to be done so that your system will install and operate correctly should be documented here.


2.2   Product Functions
Provide a summary of the major functions that the software will perform. Sometimes the function summary that is necessary for this part can be taken directly from the section of the higher-level specification (if one exists) that allocates particular functions to the software product.

For clarity:

(1) The functions should be organized in a way that makes the list of functions understandable to the customer or to anyone else reading the document for the first time.

(2) Textual or graphic methods can be used to show the different functions and their relationships. Such a diagram is not intended to show a design of a product but simply shows the logical relationships among variables.

2.3   User Characteristics
Describe those general characteristics of the intended users of the product including educational level, experience, and technical expertise. Do not state specific requirements but rather provide the reasons why certain specific requirements are later specified in section 3.

What is it about your potential user base that will impact the design? Their experience and comfort with technology will drive UI design. Other characteristics might actually influence internal design of the system.



2.4   Constraints
Provide a general description of any other items that will limit the developer's options. These can include:

(1)  Regulatory policies

(2)  Hardware limitations (for example, signal timing requirements)

(3)  Interface to other applications

(4)  Parallel operation

(5)  Audit functions

(6)  Control functions

(7)  Higher-order language requirements

(8) Signal handshake protocols (for example, XON-XOFF, ACK-NACK)

(9) Reliability requirements

(10) Criticality of the application

(11) Safety and security considerations

This section captures non-functional requirements in the customer's language. A more formal presentation of these will occur in section 3.


2.5 Assumptions and Dependencies
List each of the factors that affect the requirements stated in the SRS. These factors are not design constraints on the software but are, rather, any changes to them that can affect the requirements in the SRS. For example, an assumption might be that a specific operating system would be available on the hardware designated for the software product. If, in fact, the operating system were not available, the SRS would then have to change accordingly.

This section is catch-all for everything else that might influence the design of the system and that did not fit in any of the categories above.


2.6 Apportioning of Requirements.
Identify requirements that may be delayed until future versions of the system. After you look at the project plan and hours available, you may realize that you just cannot get everything done. This section divides the requirements into different sections for development and delivery. Remember to check with the customer: they should prioritize the requirements and decide what does and does not get done. This can also be useful if you are using an iterative life cycle model to specify which requirements will map to which iteration.


3.   Specific Requirements

This section contains all the software requirements at a level of detail sufficient to enable designers to design a system to satisfy those requirements, and testers to test that the system satisfies those requirements. Throughout this section, every stated requirement should be externally perceivable by users, operators, or other external systems. These requirements should include at a minimum a description of every input (stimulus) into the system, every output (response) from the system and all functions performed by the system in response to an input or in support of an output. The following principles apply:

(1)  Specific requirements should be stated with all the characteristics of a good SRS

·        correct
·        unambiguous
·        complete
·        consistent
·        ranked for importance and/or stability
·        verifiable
·        modifiable
·        traceable

(2)  Specific requirements should be cross-referenced to earlier documents that relate

(3)  All requirements should be uniquely identifiable (usually via numbering like 3.1.2.3)

(4)  Careful attention should be given to organizing the requirements to maximize readability (Several alternative organizations are given at end of document)

Before examining specific ways of organizing the requirements, it is helpful to understand the various items that make up requirements, as described in the following subsections. This section reiterates section 2, but is for developers, not the customer. The customer buys in with section 2; the designers use section 3 to design and build the actual application.

Remember this is not design. Do not require specific software packages, etc. unless the customer specifically requires them. Avoid over-constraining your design. Use proper terminology:

The system shall…   A required, must-have feature

The system should…   A desired feature, but one that may be deferred until later

The system may…   An optional, nice-to-have feature that may never make it to implementation.

3.1 External Interfaces
This contains a detailed description of all inputs into and outputs from the software system. It complements the interface descriptions in section 2 but does not repeat information there. Remember section 2 presents information oriented to the customer/user while section 3 is oriented to the developer.

It contains both content and format as follows:

·        Name of item
·        Description of purpose
·        Source of input or destination of output
·        Valid range, accuracy and/or tolerance
·        Units of measure
·        Timing
·        Relationships to other inputs/outputs
·        Screen formats/organization
·        Window formats/organization
·        Data formats
·        Command formats
·        End messages


3.2 Functions
Functional requirements define the fundamental actions that must take place in the software in accepting and processing the inputs and in processing and generating the outputs. These are generally listed as “shall” statements starting with “The system shall…”.

These include:

·        Validity checks on the inputs
·        Exact sequence of operations
·        Responses to abnormal situations, including overflow, communication facilities, and error handling and recovery
·        Effect of parameters
·        Relationship of outputs to inputs, including input/output sequences and formulas for input-to-output conversion



It may be appropriate to partition the functional requirements into sub-functions or sub-processes. This does not imply that the software design will also be partitioned that way.

3.3 Performance Requirements
This subsection specifies both the static and the dynamic numerical requirements placed on the software or on human interaction with the software, as a whole. Static numerical requirements may include:

(a)   The number of terminals to be supported

(b)  The number of simultaneous users to be supported

(c)   Amount and type of information to be handled

Static numerical requirements are sometimes identified under a separate section entitled capacity.

Dynamic numerical requirements may include, for example, the numbers of transactions and tasks and the amount of data to be processed within certain time periods for both normal and peak workload conditions.

3.4 Logical Database Requirements
This section specifies the logical requirements for any information that is to be placed into a database. This may include:

·        Types of information used by various functions
·        Frequency of use
·        Accessing capabilities
·        Data entities and their relationships
·        Integrity constraints
·        Data retention requirements

If the customer provided you with data models, those can be presented here. ER diagrams (or static class diagrams) can be useful here to show complex data relationships. Remember a diagram is worth a thousand words of confusing text.


3.5 Design Constraints
Specify design constraints that can be imposed by other standards, hardware limitations, etc.


3.5.1   Standards Compliance
Specify the requirements derived from existing standards or regulations. They might include:

(1)  Report format

(2)  Data naming

(3)  Accounting procedures

(4)  Audit Tracing

For example, this could specify the requirement for software to trace processing activity. Such traces are needed for some applications to meet minimum regulatory or financial standards. An audit trace requirement may, for example, state that all changes to a payroll database must be recorded in a trace file with before and after values.
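As a rough illustration of the audit trace requirement just described, the following Python sketch records the before and after values of every change. The record layout, the field names, and the function update_with_audit are assumptions made for this example; a real system would write the entries to a protected trace file rather than to a list in memory.

```python
import json


def update_with_audit(record, field, new_value, trace):
    """Change one field and append a before/after entry to the trace."""
    trace.append({"field": field, "before": record.get(field), "after": new_value})
    record[field] = new_value


# Hypothetical payroll record used only for this example.
payroll = {"employee": "E-001", "salary": 50000}
trace = []

update_with_audit(payroll, "salary", 52000, trace)

# One line per change, in a form that could be appended to a trace file.
trace_line = json.dumps(trace[0], sort_keys=True)
```

Writing the trace entry before applying the change means that an update cannot take effect without leaving a record behind.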


3.6 Software System Attributes
There are a number of attributes of software that can serve as requirements. It is important that required attributes be specified so that their achievement can be objectively verified. The following items provide a partial list of examples. These are also known as non-functional requirements or quality attributes.

These are characteristics the system must possess, but that pervade (or cross-cut) the design. These requirements have to be testable just like the functional requirements. It's easy to start philosophizing here, but keep it specific.


3.6.1 Reliability
Specify the factors required to establish the required reliability of the software system at time of delivery. If you have MTBF requirements, express them here. This doesn’t refer to just having a program that does not crash. This has a specific engineering meaning.


3.6.2 Availability
Specify the factors required to guarantee a defined availability level for the entire system such as checkpoint, recovery, and restart. This is somewhat related to reliability. Some systems run only infrequently on-demand (like MS Word). Some systems have to run 24/7 (like an e-commerce web site). The required availability will greatly impact the design. What are the requirements for system recovery from a failure? “The system shall allow users to restart the application after failure with the loss of at most 12 characters of input”.

SECURITY MEASURE
Obfuscated code is source or machine code that has been made difficult to understand. Programmers may deliberately obfuscate code to conceal its purpose or its logic, to prevent tampering, to deter reverse engineering, or as a puzzle or recreational challenge for readers. It is a form of security through obscurity. Programs known as obfuscators transform readable code into obfuscated code using various techniques that might introduce anti-debugging, anti-decompilation and anti-disassembly mechanisms. Code obfuscation is different in essence from hardware obfuscation, where the description and/or structure of a circuit is modified to hide its functionality.

Obfuscating code to prevent reverse engineering is typically done to manage risks that stem from unauthorised access to source code. These risks include loss of intellectual property, ease of probing for application vulnerabilities and loss of revenue that can result when applications are reverse engineered, modified to circumvent metering, security logic or usage control and then recompiled. Obfuscating code is, therefore, also a compensating control to manage these risks.

Obfuscation by code morphing refers to obfuscating machine language or object code rather than obfuscating the source code.
This is achieved by completely replacing a section of the compiled code with an entirely new block that expects the same machine state when it begins execution as the previous section, and will leave with the same machine state after execution as the original. However, a number of additional operations will be completed as well as some operations with an equivalent effect.
Code morphing makes disassembly of a distributed program more difficult. However, by adding unnecessarily complicated operations and hindering compiler-made optimizations, the execution time of the program is increased. For that reason, code morphing should be limited to critical portions of a program and not be used on an entire application.
Code morphing is often used in obfuscating the copy protection or other checks that a program makes to determine whether it is a valid, authentic installation, or a pirated copy, in order to make the removal of the copy-protection code more difficult than would otherwise be the case.
Code morphing is a multilevel technology containing hundreds of unique code transformation patterns, as well as a special layer that transforms some commands into virtual machine commands. Code morphing turns binary code into a form that is hard to decipher and does not resemble normally compiled code.

Advantages of obfuscation
·        Intellectual property protection
·        Obfuscation is typically used to protect the intellectual property that is present in software. This includes protecting any trade secrets that may be present in the code as well as protecting licensing implementations to prevent unauthorized use.
·        Reduced security exposure
·        If an application has private information in the code (such as SQL, usernames, and passwords), then obfuscating code with options such as string encryption can make this information harder to obtain.
·        Size reduction
·        As one of the core techniques of obfuscation is identifier renaming, size reductions are often gained by changing long, descriptive identifiers into, typically, one-character identifiers. This can lead to substantial savings in program size, albeit with a resulting loss of maintainability of the code. Many obfuscators also have the ability to remove unused code, leading to further reductions in size.
·        Library linking
·        To fully hide the intent of an obfuscated program, it is common for standard library routines to be statically linked into the obfuscated program and those routines themselves are then obfuscated. This can be useful for avoiding problems like DLL hell.
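The identifier renaming mentioned under size reduction can be sketched with Python's standard ast module (ast.unparse requires Python 3.9 or later). This is only an illustration under simplifying assumptions: it renames function parameters and local variables, keeps the function name, and is not how any particular obfuscator works. The sample function total_price is hypothetical.

```python
import ast

SOURCE = """
def total_price(unit_price, quantity):
    subtotal = unit_price * quantity
    return subtotal + subtotal // 10
"""


class Renamer(ast.NodeTransformer):
    """Rename parameters and assigned variables to short generated names."""

    def __init__(self):
        self.mapping = {}

    def _short(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        # Function parameters are always renamed.
        node.arg = self._short(node.arg)
        return node

    def visit_Name(self, node):
        # Rename assignment targets and names already seen; leave others
        # (for example built-in functions) untouched.
        if isinstance(node.ctx, ast.Store) or node.id in self.mapping:
            node.id = self._short(node.id)
        return node


renamer = Renamer()
tree = renamer.visit(ast.parse(SOURCE))
obfuscated = ast.unparse(tree)

# Both versions must still compute the same result.
original_ns, obfuscated_ns = {}, {}
exec(SOURCE, original_ns)
exec(obfuscated, obfuscated_ns)
same_result = original_ns["total_price"](20, 3) == obfuscated_ns["total_price"](20, 3)
```

The renamed source is shorter and says nothing about prices or quantities, yet it behaves identically, which is the trade-off between size and maintainability described above.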

Disadvantages of obfuscation
·        At best, obfuscation merely makes it time-consuming, but not impossible, to reverse engineer a program. When security is important, measures other than obfuscation should be used. The same trade-offs are made in branches of cryptography: an algorithm may be known to be fast but weak, but if the information is very short-lived there is little incentive, except as an intellectual exercise, for anyone to break it: the information becomes useless before it is broken.
·        No one can guarantee that obfuscation will present any particular level of difficulty to a reverse engineer.
·        Obfuscators do not provide security of a level similar to modern encryption schemes. Even obfuscation with encryption can have flaws. Any program or data that is encrypted must be decrypted before it can be used by the computer. So it must exist, unencrypted, somewhere in memory; a reverse engineer can take a snapshot of that memory. Also, any strong encryption requires a key for decryption. For the program to be executable the key must be provided, leaving another avenue open for reverse engineering.
·        Debuggers also provide powerful tools to look at the operation of a running program, and decompilers exist to help make bytecode and machine code human-readable. Reverse engineers can use these tools to crack systems with security dependent on obfuscation, including digital rights management systems.
·        Reverse engineering is partly a study in pattern recognition, and the good engineer quickly learns the quirks of a particular compiler, processor, or even programmer, and can make educated guesses about the original code.
·        Debugging
·        Obfuscated code is extremely difficult to debug. Variable names will no longer make sense, and the structure of the code itself will likely be modified beyond recognition. This fact generally forces developers to maintain two builds: One with the original, unobfuscated source code that can be easily debugged, and another for release. While both builds should be tested to make sure they perform identically, the second build is generally reliably constructed from the first by an obfuscator.
·        This limitation does not apply to intermediate language obfuscators (e.g. for .NET and Java), which generally work on compiled assemblies rather than on source code.
·        Portability
·        Obfuscated code often depends on the particular characteristics of the platform and compiler, making it difficult to manage if either changes. This only applies to source code obfuscation. Obfuscation against intermediate languages does not have this limitation, though obfuscation can make it harder or impossible to decompile to a higher-level language such as C# or Java.
·        Conflicts with Reflection APIs
·        Reflection is a set of APIs in various languages that allow an object to be examined or created just by knowing its class name at run time. Many obfuscators allow specified classes to be exempt from renaming, and it is also possible to let a class be renamed and call it by its new name. However, the former option places limits on the dynamism of code, while the latter adds a great deal of complexity and inconvenience to the system.

Obfuscating software
·        A vast variety of tools exist to perform or assist with code obfuscation. These include experimental research tools created by academics, hobbyist tools, commercial products written by professionals, and open-source software. Deobfuscators do the reverse.
·        Software obfuscation tools include specialized obfuscators to demonstrate a relatively limited technique, more general obfuscators which attempt a more thorough obfuscation, and combined-function tools which obfuscate code as part of a larger goal such as software licensing enforcement.
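The conflict with reflection APIs described earlier in this section can be sketched in Python, where getattr plays the role of a reflection lookup. The class ReportExporter, the helper create_by_name, and the renamed attribute "a" are hypothetical; the two SimpleNamespace objects merely stand in for a module before and after an obfuscator has renamed its classes.

```python
from types import SimpleNamespace


class ReportExporter:
    def run(self):
        return "exported"


def create_by_name(module, class_name):
    """Reflection-style lookup: create an object from its class name at run time."""
    return getattr(module, class_name)()


# The module as written by the developer.
original = SimpleNamespace(ReportExporter=ReportExporter)
# The same module after a renaming pass: the class now lives under the name 'a'.
renamed = SimpleNamespace(a=ReportExporter)

result_original = create_by_name(original, "ReportExporter").run()

try:
    create_by_name(renamed, "ReportExporter")
    lookup_survived = True
except AttributeError:
    lookup_survived = False

# Calling the class by its new name works, but callers must know the mapping.
result_new_name = create_by_name(renamed, "a").run()
```

The failed lookup is why obfuscators offer the two options mentioned in the text: exempt the class from renaming, or keep track of the new name wherever the class is looked up by string.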

Validation
·        Compiler Testing Challenges
·        There are unique challenges in testing compilers. This is due to several key issues:
·        Testing is difficult due to the complexity of the compiler since the source includes many algorithms involving instruction scheduling, register allocation, software pipelining, vector optimization, and so on.
·        The test input domain is virtually infinite. Even if some inputs are erroneous or meaningless, the compiler must correctly handle those inputs without fatal internal errors, and must report reasonable messages back to the user.
·        In addition to these general compiler challenges, there are additional challenges in testing the TI (Texas Instruments) compiler.
·        The TI compiler must be tested across multiple targets and multiple option combinations.
·        The TI compiler supports multiple execution platforms.
·        The TI compiler supports a long-lived code base, in some cases extending to 25 or more years.
·        To overcome these challenges, the compiler command-line shell tool, the C I/O run-time libraries, and TI command-line simulators allow for batch processing of compiler testing. TI has developed extensive automation to handle the batch testing of the compiler.
·        Robustness Mechanisms
·        There are several mechanisms in place to address compiler robustness at TI. Some key points:
·        Re-use: The compiler supports a minimum of 6 different target architectures. There is extensive re-use of compiler technology across these targets in each compiler release. Some details:
·        About 60% of the code base is target-independent.
·        For any given compiler about 90% is target-independent.
·        On average each line of source code is re-used 6 times.
·        Test code is also re-used. Most tests developed for one compiler are leveraged for use with all targets.
·        Peer Reviews: Development of compiler technology is handled through several review cycles before being added to the source base.
·        Each new feature requires a functional review.
·        Each new feature requires a separate design review.
·        Prior to a code review, new features require a code overview where implementation details are analyzed and decided.
·        Each change to the code base, even a minor one, is reviewed by at least one other developer. Code reviews are performed online using Code Collaborator.
·        Check-ins to the code base are prevented until the reviews are completed.
·        Analysis Tools: Several analysis tools are used both during development and during testing.
·        Purify: Each release is checked for memory leaks, buffer overruns, heap corruption, dangling pointers, etc.
·        PureCoverage, Gcov: Coverage data is collected on each release and correlated by module, tool, and target.
·        Coverage data is stored in a database for tracking and history.
·        Automation: The compiler build and validation process is entirely automated. The input to a compiler release validation includes a test plan and the final output of the validation is a tar file of all validation results as well as a summary report. Some automation details:
·        The build and validation automation controls nightly builds and validations, as well as all release validations.
·        LSF is used to leverage thousands of hosts to run the validations in parallel.
·        Nightly builds/validations are run on thousands of hosts with a 12-16 hour running time.
·        Release builds/validations are run on thousands of hosts with a 48-72 hour running time.
·        Compiler Validations
·        TI compiler validation is performed at 3 different stages.
·        Prior to check-in: A "pre-commit validation" must successfully pass in order for a change to be allowed into the code base. This pre-commit validation contains a sampling of test suites designed to provide almost full coverage.
·        Nightly validation: This tests a nightly build which incorporates all committed and approved changes from the previous day. The nightly validation must pass before any new changes are allowed into the code base. This validation includes the complete test suites, expanded option combination testing, and random option combinations. In addition, the nightly validation will test interactions between all source updates from the previous day.
·        Release validation: A release validation includes all complete test suites and each suite is tested with extensive option sets. These option sets will include hundreds of combinations. A release validation will compile over 200,000 individual executable, self-checking test cases.
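The idea of running self-checking test cases under many option combinations can be sketched in Python as follows. Everything here is an assumption made for illustration: the flags are placeholders rather than actual TI compiler options, the test cases are trivial, and fake_compile_and_run stands in for really compiling and executing a test program.

```python
import itertools

# Placeholder option sets; a real compiler defines its own flags.
OPTION_SETS = {
    "optimization": ["-O0", "-O2"],
    "debug_info": ["", "-g"],
}

# Each self-checking case pairs a tiny "program" with the result it must produce.
TEST_CASES = {
    "sum_to_ten": (lambda: sum(range(11)), 55),
    "two_to_the_eighth": (lambda: 2 ** 8, 256),
}


def option_combinations(option_sets):
    """Yield every combination of one choice per option set, dropping empty flags."""
    for combo in itertools.product(*option_sets.values()):
        yield tuple(flag for flag in combo if flag)


def fake_compile_and_run(program, options):
    """Stand-in for compiling with `options` and executing the result."""
    return program()


def run_validation(cases, option_sets, compile_and_run):
    """Run every case under every option combination and record pass/fail."""
    results = {}
    for name, (program, expected) in cases.items():
        for options in option_combinations(option_sets):
            results[(name, options)] = compile_and_run(program, options) == expected
    return results


results = run_validation(TEST_CASES, OPTION_SETS, fake_compile_and_run)
failures = [key for key, passed in results.items() if not passed]
summary = f"{len(results) - len(failures)} of {len(results)} passed"
```

With two option sets of two choices each, every test case runs four times. The number of runs is the product of the sizes of all option sets, which is why release validations with hundreds of combinations need large-scale automation.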
