
Tuesday, March 16, 2010

Characterizing Browsing Strategies in the World-Wide Web

Abstract


This paper presents the results of a study conducted at Georgia Institute of Technology that captured client-side user events of NCSA's XMosaic. Actual user behavior, as determined from client-side log file analysis, supplemented our understanding of user navigation strategies as well as provided real interface usage data. Log file analysis also yielded design and usability suggestions for WWW pages, sites and browsers. The methodology of the study and findings are discussed along with future research directions.
Keywords


Hypertext Navigation, Log Files, User Modeling
Introduction


With the prolific growth of the World-Wide Web (WWW) [Berners-Lee et al., 1992] in the past year there has been an increased demand for an understanding of the WWW audience. Several studies exist that determine demographics and some behavioral characteristics of WWW users via self-selection [Pitkow and Recker, 1994a & 1994b]. Though highly informative, such studies only provide high-level trends in Web use (e.g., frequency of Web browser usage to access research reports, weather information, etc.). Other areas of audience analysis, such as navigation strategies and interface usage, remain unstudied. Thus, the surveys provide estimations of who is using the WWW, but fail to provide detailed information on exactly how the Web is being used. Actual user behavior, as determined from client-side log file analysis, can supplement the understanding of Web users with more concrete data. Log file analysis also yields design and usability guidelines for WWW pages, sites and browsers.
This paper presents the results of a three-week study conducted at Georgia Institute of Technology that captured client-side user events of NCSA's XMosaic. Specifically, the paper first presents a review of related hypertext browsing and searching literature and how it relates to the Web, followed by a description of the study's methodology. An analysis of user navigation patterns ensues. Lastly, a discussion and recommendations for document design are presented.
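To give a flavor of the kind of log file analysis involved, here is a minimal sketch that tallies how often each navigation method appears in a client-side event log. The log format, field layout and event names below are hypothetical, invented purely for illustration; they do not reproduce the actual XMosaic instrumentation used in the study.

    from collections import Counter

    # Hypothetical log format (invented for illustration):
    #   timestamp <TAB> event <TAB> url
    #   e.g. "1994-01-10 09:15:02\tHYPERLINK\thttp://www.gatech.edu/"
    def tally_navigation_events(path):
        """Count how often each navigation method (hyperlink, back,
        open-URL, ...) occurs in a client-side event log."""
        counts = Counter()
        with open(path) as log:
            for line in log:
                fields = line.rstrip("\n").split("\t")
                if len(fields) == 3:              # skip malformed lines
                    counts[fields[1]] += 1        # fields[1] is the event name
        return counts

    for event, n in tally_navigation_events("mosaic_events.log").most_common():
        print(event, n)

A tally like this is enough to answer questions such as what fraction of page requests come from the Back button rather than from following hyperlinks.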

Literature Review


Many studies have addressed user strategies and usability of closed hypermedia systems, databases and library information systems [Carmel et al., 1992]. Most distinguish between browsing and searching. Cove and Walsh [Cove & Walsh, 1988] distinguish three browsing strategies:
Search browsing: directed search, where the goal is known
General-purpose browsing: consulting sources that have a high likelihood of items of interest
Serendipitous browsing: purely random browsing
This continuum provides a nice middle ground to distinguish between browsing as a method of completing a task and open ended browsing with no particular goal in mind. Marchionini [Marchionini, 1989] further develops this distinction in designating open and closed tasks. Closed tasks have a specific answer and often integrate subgoals. Open tasks are much more subject oriented and less specific. Browsing can be used as a method of fulfilling either open or closed tasks.
Intuitively, it would seem that browsing and searching are not mutually exclusive activities. In Bates's [Bates, 1989] work on berrypicking, a user's search strategy is constantly evolving through browsing. Users often move back and forth between strategies. Similarly, Bieber and Wan [Bieber & Wan, 1994] discuss the use of backtracking within a multi-windowed hypertext environment. They introduce the concept of "task-based backtracking," in which a user backtracks to compare information from different sources for the same task or to operate two tasks simultaneously. A similar technique, in a Web environment, would be backtracking to review previously retrieved pages.
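Backtracking of this kind is commonly modeled with two stacks: one holding the pages the user can go back to, and one holding the pages backtracked away from. The sketch below is a minimal illustration of that data structure under that assumption, not a description of how any particular browser implements its history.

    class HistoryModel:
        """Minimal back/forward stack model of browser backtracking."""

        def __init__(self, start_url):
            self.current = start_url
            self.back_stack = []     # pages we can backtrack to
            self.forward_stack = []  # pages we backtracked away from

        def follow_link(self, url):
            # Following a new link discards any forward history.
            self.back_stack.append(self.current)
            self.forward_stack.clear()
            self.current = url

        def back(self):
            # Revisit a previously retrieved page, as in task-based backtracking.
            if self.back_stack:
                self.forward_stack.append(self.current)
                self.current = self.back_stack.pop()
            return self.current

        def forward(self):
            if self.forward_stack:
                self.back_stack.append(self.current)
                self.current = self.forward_stack.pop()
            return self.current

One consequence of this model is that a user who backtracks to compare sources and then follows a new link loses the forward half of the comparison, which is one reason the multi-windowed comparison Bieber and Wan describe is attractive.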

All of these studies were performed on closed, single-author systems. The WWW, however, is an open, collaborative and exceedingly dynamic hypermedia system. These previous findings provide the basis and structure for describing the ways a user population behaves in a dynamic information ecology like the WWW.

Given that we expect to find the same kinds of strategies used in the WWW, supporting both the browser and the searcher in designing WWW pages and servers is necessary, although difficult. Furthermore, supporting the kind of task switching described by Bates and by Bieber and Wan adds another level of complexity, because that work implies that a user should be able to switch strategies at any time.

It has long been recognized that methods for supporting directed searching are needed. As a response to this, certain WWW servers are completely searchable and there are World-Wide Web search engines available.

Supporting browsing, though, may be a more difficult task. Both Laurel [Laurel, 1991] and Bernstein approach the topic of how to assess and design hypertexts for the browsing user. Laurel considers interactivity to be the primary goal. She defines a continuum for interactivity along three variables: frequency (frequency of choices), range (number of possible choices) and significance (implication of choices). Laurel contends that users will pay the price "often enthusiastically -- in order to gain a kind of lifelikeness, including the possibility of surprise and delight." Bernstein takes a slightly different approach with his "volatile hypertexts" [Bernstein, 1991]. He argues that the value of hypertext lies in its ability to create serendipitous connections between unexpected ideas.

There is a tension between designing for a browser and designing for a searcher. The logical hierarchy of a file structure or a searchable database may work fine for a closed-task, goal-oriented user. But a user looking for the unexpected element or a serendipitous connection may be frustrated by the precision required by these methods. The first step in balancing these needs is to determine which strategies are being used by the population. To do this, we collected log files of users interacting with the Web.

History of the Internet

Before the widespread use of internetworking that led to the Internet, most communication networks were limited by their nature to only allow communications between the stations on the local network, and the prevalent computer networking method was based on the central mainframe computer model. Several research programs began to explore and articulate principles of networking between physically separate networks, leading to the development of the packet-switching model of digital networking. These research efforts included those of the laboratories of Donald Davies (NPL), Paul Baran (RAND Corporation), and Leonard Kleinrock at MIT and at UCLA.

The research led to the development of several packet-switched networking solutions in the late 1960s and 1970s, including ARPANET and the X.25 protocols. Additionally, public access and hobbyist networking systems grew in popularity, including Unix-to-Unix Copy (UUCP) and FidoNet. These were, however, still disjointed, separate networks, served only by limited gateways between them. This led to the application of packet switching to develop a protocol for internetworking, whereby multiple different networks could be joined together into a super-framework of networks. By defining a simple common network system, the Internet Protocol Suite, the concept of the network could be separated from its physical implementation.

This spread of internetworking began to form into the idea of a global network that would be called the Internet, based on standardized protocols officially implemented in 1982. Adoption and interconnection occurred quickly across the advanced telecommunication networks of the western world, and then began to penetrate into the rest of the world as the Internet became the de facto international standard for the global network. However, the disparity of growth between advanced nations and third-world countries led to a digital divide that is still a concern today.
Following commercialization and the introduction of privately run Internet service providers in the 1980s, and the Internet's expansion for popular use in the 1990s, the Internet has had a drastic impact on culture and commerce. This includes the rise of near-instant communication by electronic mail (e-mail), text-based discussion forums, and the World Wide Web. Investor speculation in the new markets provided by these innovations also led to the inflation and subsequent collapse of the dot-com bubble. Despite this, the Internet continues to grow, driven by commerce, ever greater amounts of online information and knowledge, and social networking, known as Web 2.0.

ARPANET


[Image: Len Kleinrock and the first IMP]
Promoted to the head of the information processing office at DARPA, Robert Taylor intended to realize Licklider's ideas of an interconnected networking system. Bringing in Larry Roberts from MIT, he initiated a project to build such a network. The first ARPANET link was established between the University of California, Los Angeles and the Stanford Research Institute at 22:30 hours on October 29, 1969. By December 5, 1969, the four-node network was complete, with the addition of the University of Utah and the University of California, Santa Barbara. Building on ideas developed in ALOHAnet, the ARPANET grew rapidly. By 1981, the number of hosts had grown to 213, with a new host being added approximately every twenty days.
ARPANET became the technical core of what would become the Internet, and a primary tool in developing the technologies used. ARPANET development was centered around the Request for Comments (RFC) process, still used today for proposing and distributing Internet Protocols and Systems. RFC 1, entitled "Host Software", was written by Steve Crocker from the University of California, Los Angeles, and published on April 7, 1969. These early years were documented in the 1972 film Computer Networks: The Heralds of Resource Sharing.
International collaborations on ARPANET were sparse. For various political reasons, European developers were concerned with developing the X.25 networks. Notable exceptions were the Norwegian Seismic Array (NORSAR) in 1972, followed in 1973 by Sweden with satellite links to the Tanum Earth Station and Peter Kirstein's research group in the UK, initially at the Institute of Computer Science, London University and later at University College London.





Asynchronous vs. Synchronous

Most communications circuits perform functions described in the physical and data link layers of the OSI Model. There are two general strategies for communicating over a physical circuit: asynchronous and synchronous. Each has its advantages and disadvantages.

ASYNCHRONOUS

Asynchronous communication utilizes a transmitter, a receiver and a wire, without coordination about the timing of individual bits. The two end points do not agree in advance on exactly how long the transmitter holds the signal at a certain level to represent a single digital bit. Each device uses its own clock to measure out the 'length' of a bit. The transmitting device simply transmits; the receiving device has to examine the incoming signal, work out what it is receiving, and retime its clock to match the incoming signal.

Sending data encoded into your signal requires that the sender and receiver both use the same encoding/decoding method and know where to look in the signal to find data. Asynchronous systems do not send separate information to indicate the encoding or clocking. The receiver must work out the clocking of the signal on its own: it must decide where to look in the signal stream to find ones and zeroes, and decide for itself where each individual bit stops and starts. This information is not carried in the signal sent by the transmitting unit.

When the receiver of a signal has to derive how that signal is organized without consulting the transmitting device, the exchange is called asynchronous communication. In short, the two ends do not synchronize the connection parameters before communicating. Asynchronous communication is more efficient when loss and error rates over the transmission medium are low, because data is not retransmitted and no time is spent negotiating connection parameters at the beginning of transmission. Asynchronous systems just transmit and let the far-end station figure it out. Asynchronous is sometimes called "best effort" transmission, because one side simply transmits and the other does its best to receive.
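A concrete way to see this is the classic start/stop framing used on RS-232 style links: the line idles at 1, the transmitter brackets each byte with a start bit and a stop bit, and the receiver retimes its clock on every start-bit edge. The sketch below encodes and decodes one byte of 8N1 framing at the bit level; it illustrates the principle only and is not a driver for real hardware.

    def frame_8n1(byte):
        """Encode one byte as 8N1: a start bit (0), eight data bits
        sent LSB-first, and a stop bit (1). The idle-to-start (1 -> 0)
        edge is what lets the receiver retime its own clock."""
        bits = [0]                                   # start bit
        bits += [(byte >> i) & 1 for i in range(8)]  # data bits, LSB first
        bits += [1]                                  # stop bit
        return bits

    def deframe_8n1(bits):
        """Decode one 10-bit 8N1 frame, checking the start and stop bits."""
        if bits[0] != 0 or bits[9] != 1:
            raise ValueError("framing error")
        return sum(bits[1 + i] << i for i in range(8))

    assert deframe_8n1(frame_8n1(0x41)) == 0x41      # 'A' survives the round trip

Because every byte carries its own start and stop bits, the receiver never negotiates timing in advance; both ends only need to agree on the nominal bit rate and frame format (here, 8 data bits, no parity, 1 stop bit).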

EXAMPLES:
Asynchronous communication is used on RS-232 based serial devices such as an IBM-compatible computer's COM1-COM4 ports. It is also the method used to communicate with an external modem, and for devices such as a computer's keyboard and mouse, including the PS/2 ports. (Despite its name, Asynchronous Transfer Mode (ATM) refers to asynchronous time-division multiplexing of fixed-size cells, not to this kind of start/stop signaling.)

Think of asynchronous as a faster means of connecting, but less reliable.
SYNCHRONOUS

Synchronous systems negotiate the communication parameters at the data link layer before communication begins. Basic synchronous systems synchronize both clocks before transmission begins and reset their error counters and the like; more advanced systems may negotiate things like error correction and compression.

It is possible for both sides to try to synchronize the connection at the same time, so there is usually a process to decide which end is in control. Both sides may go through a lengthy negotiation cycle in which they exchange communications parameters and status information. Once a connection is established, the transmitter sends out a signal, and the receiver sends back data about that transmission and what it received. This negotiation adds overhead that is wasted on low error-rate lines, but it is highly efficient in systems where the transmission medium itself (an electric wire, radio signal or laser beam) is not particularly reliable.
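As a toy illustration of the negotiation step, the sketch below has the initiating end propose its capabilities and the responding end reply with the subset it also supports, so both ends arrive at the same agreed parameter set before any data flows. The message shape and parameter names are invented for illustration and do not correspond to any specific link-layer protocol.

    def negotiate(initiator_caps, responder_caps):
        """Toy link setup: the initiator proposes its capabilities in a
        setup message, the responder intersects them with its own, and
        the agreed subset governs the session. Names are hypothetical."""
        proposal = set(initiator_caps)           # sent by the controlling end
        agreed = proposal & set(responder_caps)  # responder's reply
        return sorted(agreed)

    # Both ends end up using the same agreed parameters:
    print(negotiate({"crc16", "compress-v1", "clock-sync"},
                    {"crc16", "clock-sync"}))
    # -> ['clock-sync', 'crc16']

The cost of this exchange is the extra round trips before user data moves, which is exactly the overhead described in the previous paragraph.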

Tuesday, February 2, 2010

Netiquette

Netiquette is a set of social conventions that facilitate interaction over networks, ranging from Usenet and mailing lists to blogs and forums. These rules were described in IETF RFC 1855. However, like many Internet phenomena, the concept and its application remain in a state of flux and vary from community to community. The points most strongly emphasized about Usenet netiquette often include using simple electronic signatures and avoiding multiposting, cross-posting, off-topic posting, hijacking a discussion thread, and other techniques used to minimize the effort required to read a post or a thread. Netiquette guidelines posted by IBM for employees using Second Life in an official capacity, however, focus on basic professionalism, maintaining a tenable work environment, and protecting IBM's intellectual property.

The history of netiquette began before the 1991 start of the World Wide Web, when text-based email, Telnet, Gopher, WAIS and FTP traffic from educational and research bodies dominated the Internet. At that time, it was considered somewhat indecent to make commercial public postings, and the limitations of insecure, text-only communications demanded that the community have a common set of rules. The term "netiquette" has been in use since at least 1983, as evidenced by posts in the satirical "Dear Emily Postnews" column.



E-learning

E-learning is a term that encompasses all forms of Technology-Enhanced Learning (TEL), or very specific types of TEL such as online or Web-based learning. Nevertheless, the term does not have a universally accepted definition, and the e-learning industry is divided over whether a technology-enhanced system can be called e-learning if there is no set pedagogy; some argue that e-learning is "pedagogy empowered by digital technology".

The term e-learning is ambiguous to those outside the e-learning industry, and even within its diverse disciplines it has different meanings to different people. For instance, in companies it often refers to strategies that use the company network to deliver training courses to employees, while in most universities it now describes a mode of attending a course or program of study in which students rarely or never meet face-to-face or access on-campus educational facilities, because they study online.

The objectives of e-learning are:

  1. Improved performance
  2. Increased access
  3. Convenience and flexibility for learners
  4. Development of the skills and competencies needed in the 21st century, in particular ensuring that learners have the digital literacy skills required in their discipline, profession or career

The advantages of e-learning are:

  1. Pay less per credit hour
  2. Reduce overall training time
  3. Spread training out over extended periods of time (even months)
  4. Bookmark progress (computer remembering where the student left off so they can resume the courses from there)
  5. Remain in one location (e.g., home, office, airport, coffee shop, etc.) with no need to travel
  6. Receive quality training that bolsters job performance

Sunday, January 31, 2010

Advantages & Disadvantages of the Internet

Advantages

The Internet provides many facilities to people. The main advantages of the Internet are discussed below.

1. Sharing Information

You can share information with other people around the world. Scientists and researchers can interact with each other to share knowledge and get guidance. Sharing information through the Internet is a very easy, cheap and fast method.

2. Collection of Information

A lot of information of different types is stored on web servers on the Internet; billions of websites contain information in the form of text and pictures. You can easily collect information on almost any topic. For this purpose, special websites called search engines are available to search for information on any topic. Popular search engines include altavista.com, search.com, yahoo.com and ask.com. Scientists, writers, engineers and many other people use these search engines to collect the latest information for different purposes. Usually, the information on the Internet is free of cost, and it is available 24 hours a day.

3. News

You can get the latest news of the world on the Internet. Most of the newspapers of the world are also available on the Internet, and their websites carry the latest news about events happening around the world. These websites are updated periodically, or immediately when an event happens.

4. Searching Jobs

You can search for different types of jobs all over the world. Most organizations and departments advertise their vacancies on the Internet, and search engines can also be used to find jobs. You can then apply for the desired job through the Internet.

5. Advertisement

Today, most commercial organizations advertise their products through the Internet. It is a very cheap and efficient way of advertising, and products can be presented attractively to people around the world.

6. Communication

You can communicate with others around the world through the Internet. You can talk while seeing one another, just as if you were talking with friends in your drawing room. For this purpose, different services are provided on the Internet, such as:

* Chatting
* Video conferencing
* E-mail
* Internet telephony etc.

7. Entertainment

The Internet also provides different types of entertainment. You can play games with other people in any part of the world. Similarly, you can watch movies, listen to music and so on. You can also make new friends on the Internet.

8. Online Education

The Internet provides the facility of online education. Many university websites provide lectures and tutorials on different subjects. You can download these lectures or tutorials to your own computer and review them repeatedly. It is a very cheap and easy way to get an education.

9. Online Results

Today, most universities and education boards publish results on the Internet. Students can view their results from any part of the country or the world.

10. Online Airlines and Railway Schedules

Many airline companies and Pakistan Railway provide their flight and train schedules, respectively, on the Internet.

11. Online Medical Advice

Many websites are also available with information about different diseases. You can consult a panel of online doctors to get advice about any medical problem. In addition, a lot of material is available on the Internet for research in the medical field.

Disadvantages

Although the Internet has many advantages, it also has some disadvantages. The main disadvantages are:

1. Viruses

Today, the Internet is the most common means of spreading viruses. Most viruses transfer from one computer to another through e-mail or when files are downloaded from the Internet. These viruses can cause different problems; for example, they can degrade your computer's performance and damage valuable data and software stored on it.

2. Security Problems

Valuable websites can be damaged by hackers, and your valuable data may be deleted. Similarly, confidential data may be accessed by unauthorized persons.

3. Immorality

Some websites contain immoral material in the form of text, pictures or movies. Such websites can damage the character of the new generation.

4. Filtration of Information

When a keyword is given to a search engine to search for information on a specific topic, a large number of related links are displayed. In this case, it becomes difficult to filter out the required information.

5. Accuracy of Information

A lot of information about a particular topic is stored on websites, and some of it may be incorrect or not authentic, so it becomes difficult to select the correct information.

6. Wastage of Time

A lot of time can be wasted collecting information on the Internet, and some people waste much time chatting or playing games. At home and in offices, many people use the Internet without any productive purpose.

7. English language problems

Most of the information on the Internet is available in English, so some people cannot take full advantage of the Internet.