Real-time render of the 3dstars (Hipparcos) catalog with proper motion for the upcoming Nightshade 12 release. We begin at a distant orbit about Alcyone in the Pleiades cluster and move outward toward the edge of the galaxy. Then we move back toward Alcyone, never ceasing to orbit that star.
Early render of an actual section of night sky from an astronomy visualization project I’m working on. Still lots of tweaking to do but satisfying to get an actual image out of the thing. The data set is a composite of Hipparcos, Tycho2, and NOMAD star catalog data. This particular scene contained over 6 million stars. Rendering is in real-time at fully interactive frame rates. A larger image is at http://trystan.org/press/wp-content/uploads/2011/07/Screenshot.png.
Ubuntu has been my primary operating system for 3 years running. It has the well deserved reputation of being the most user friendly and ‘desktop’ oriented Linux distribution. However, I recently made the switch to Fedora. I thought I’d add my 2 cents to the barrage of opinions regarding the now heated battle between Ubuntu 11.04, featuring the compiz based Unity desktop, and the Gnome 3 driven Fedora 15.
Why did I switch? After all, I thought Unity was pretty good. It replaced the ‘start’ menu concept that has prevailed for the last 2 decades with a search based concept that works quite well. It borrowed the OSX model of focus/context sensitive application menus on the top panel. Good stuff. The bad? Compiz, the sub-system responsible for taking all the 2D GUI stuff and putting it in an OpenGL context so it can be warped, overlaid, blended, etc. Unity is implemented as a Compiz plugin and that’s like building on mud. It’s easy to crash and has a nasty tendency to mess up OpenGL applications. All my OpenSceneGraph apps (including simple demos) stopped working on the version of compiz released with 11.04. This is nothing new. Compiz has a history of not playing nice with other software. It must be respected for being ‘a first’ for Linux, but it’s not ready for ‘prime time’ on a development box that must be stable. Additionally, I began to have fairly severe stability problems with the Eclipse IDE (Helios) upon the update to 11.04. It would just randomly crash, usually upon indexing a large C++ project.
Enter Fedora 15 and Gnome 3. Gnome 3 has many of the same features as Unity but does not require Compiz. It uses a completely different OpenGL compositor that, although not as feature rich as Compiz, appears to play nicer with other OpenGL apps and doesn’t totally bork OpenSceneGraph (a must have for my work). As an added bonus, Fedora 15 ships gcc 4.6 whereas Ubuntu 11.04 is still on the 4.5 line. In contrast to its behavior on Ubuntu 11.04, Eclipse (Helios) has been solid thus far.
My takeaway is Fedora is a bit more developer friendly while Ubuntu still may be the better choice for pure desktop users. The base installation of Fedora is sparse by comparison to Ubuntu and users unfamiliar with repositories, package managers and how all that junk ties together may have a harder time getting a fully functional Fedora box together.
Computer science pioneer Edsger Dijkstra published a set of aphorisms that reflected his solid belief that one’s choice of programming language affects the cognitive capabilities of the programmer. The more colorful phrases included, “The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offense” and “The tools we use have a profound (and devious!) influence on our thinking habits, and, therefore, on our thinking abilities” (Dijkstra 1982). Although never directly referenced by Dijkstra, the comments imply a belief in a “language is thought” hypothesis such as those championed by Benjamin Whorf (Carroll 1997).
The degree to which spoken language influences thought is intensely debated, with Steven Pinker amongst the most outspoken of Whorfian hypothesis critics. Daniel Casasanto (2008) refutes the popular anti-Whorfian stance proclaimed by Pinker on the grounds that Pinker’s claims are too broad in scope. Casasanto argues that one must distinguish between an Orwellian flavor of the hypothesis, where language and thought are formal equivalences, and linguistic relativism where language and thought simply influence one another. Although Pinker’s assertions may successfully challenge an Orwellian interpretation of Whorf, extending the argument to ‘weaker’ forms of the hypothesis, such as linguistic relativism, is logically fallacious (Casasanto 2008).
The degree to which programming languages in particular influence cognition has received little attention. An informal survey conducted by Richard Wexelblat (1980) revealed that some programmers believe certain languages can interfere with abstract forms of reasoning such as data-structure design and high level engineering activities. Wexelblat expressed his personal concern over the popularity of “do it quick and dirty” languages, such as BASIC, and the generation of students indoctrinated to them. Although interesting, the usefulness of such self-reports is extremely limited. Accordingly, Wexelblat concludes with a call for controlled studies on the subject. Unfortunately, to my knowledge no such study has ever been conducted. Indeed, it’s likely that an experimental approach to such research is not yet sufficiently developed.
A strong version of the Whorfian hypothesis is likely an inappropriate foundation for considering the interaction between the mind of a programmer and a programming language. The language is not the programmer’s thought. Minimally, the language is the cumulative thoughts of many contributors under many transformations. However, a weaker version of the hypothesis may provide a starting point for the study of how programming languages influence the cognition of the programmer. Even so, a good deal of definitional work needs to occur before a Whorfian-like hypothesis can be applied to programming languages. Here are some points I’ve identified.
1. Definition and scope of the term ‘language’. Certainly, programming languages are simpler than natural languages. There are closed definitions for the syntax and semantics for programming languages. One may assert that the set of all closed language definitions is the definition of language in the context of programmer-machine interactions. However, this definition is far too narrow as the programmer rarely interacts with the language specification itself. Rather, the interactive elements include emergent properties of the language, such as graphical and syntactical abstractions. Therefore, a useful definition of language in this domain must include all abstractions that carry meaning for the programmer including atomic elements such as keywords up to highly composite elements like user interface components. Developing a clear and agreeable definition will be challenging.
2. Capturing differences in state or state changes. This involves some method of objectively measuring changes in the programmer’s behavior due to changes in the ‘language’, using the broader definition above. These could be changes in the structure of the language that influence adaptive behavior on the part of the programmer or changes in the ‘behavior of the language’. Therefore, operationalized metrics must be established that represent an abstract measure of machine/language state and human state.
One could begin with high level programming language qualities such as type system, syntactical style, binding style, etc. and measure how a programmer’s approach to problem solving changes after continued exposure and use of a language that sits at a particular point on this qualitative coordinate system. One method that may be used to examine changes in the programmer’s state, in order to infer changes in problem solving strategy, is examination of their semantic network via lexical decision tasks at different intervals of language exposure time. Such tasks are used to measure word or concept availability in the domain of natural language (McNamara 2005).
If such experiments demonstrated semantic network changes in the programmer that appeared to be programming language specific, then Dijkstra’s intuition that a programming language influences a programmer’s thought processes would gain validity. It follows that such influences would form a loop between the programmer and the machine/language, forming a cognitive system where the behavior of the programmer influences the language behavior, which influences the programmer’s behavior, and so forth. The practical applications of such findings would include a renewed appreciation of programming language diversity, and they would help elucidate the relative strengths and weaknesses of language properties with respect to how they combine with the programmer to solve problems.
Carroll J.B. (Ed.). (1997). Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf. Cambridge, Mass.: MIT Press.
Casasanto D. (2008). Who’s afraid of the big bad Whorf? Crosslingual Differences in Temporal Language and Thought. Language Learning. 58(Suppl. 1), 63-79.
Dijkstra E.W. (1982). Selected Writings on Computing: A Personal Perspective. (pp. 129-131). New York, NY.: Springer-Verlag.
McNamara T.P. (2005). Semantic Priming, perspectives from memory and word recognition. New York, NY.: Taylor & Francis Group.
Wexelblat R.L. (1980). The consequences of one’s first programming language. Proceedings of the 3rd ACM SIGSMALL symposium and the first SIGPC symposium on Small systems. 52-55.
I recently replicated a Shepard and Metzler (1971) style experiment in order to explore OpenGL programming under Matlab. Additionally, I wanted to manipulate certain variables, such as the type (class) of object, to verify predictions made by emerging models of top-down processes of human vision such as those mentioned in the previous post. This replication provided a convenient point to start.
In this implementation, the participant is shown a series of 3D object pairs. On some trials, the two objects are different. On other trials, one object is simply a rotation on the xy plane of the other object by a variable multiple of 20 degrees. The participant responds by pressing ‘1’ to indicate the same object or ‘2’ to indicate different objects while their response time is measured. Previous research shows that response time increases linearly relative to the number of degrees one object is rotated from the other (Shepard, Metzler 1971).
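The trial structure described above can be sketched roughly as follows. This is an illustrative Python outline, not the Matlab code from the experiment; the names make_trials and is_correct, the 50/50 split between same and different trials, and the cap of 9 steps (180 degrees) are all assumptions made for the example.

```python
import random

ANGLE_STEP = 20  # degrees; rotations are multiples of this step


def make_trials(n_trials, max_multiple=9, p_same=0.5, seed=None):
    """Build a random trial list (p_same and max_multiple are assumed values).

    A 'same' trial pairs an object with a rotated copy of itself;
    a 'different' trial pairs two distinct objects, so it has no angle.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        same = rng.random() < p_same
        angle = ANGLE_STEP * rng.randint(0, max_multiple) if same else None
        trials.append({"same": same, "angle": angle})
    return trials


def is_correct(trial, key):
    """Score a key press: '1' means same object, '2' means different objects."""
    return (key == "1") == trial["same"]


trials = make_trials(100, seed=1)
```

Response times would then be recorded per trial and grouped by angle for the same-object trials.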
The image to the right below shows the stimuli for one trial. The image to the left depicts my own results for 100 trials. My response time is far less than the typical Shepard (1971) subject but I have years of experience working with 3D graphics and, of course, wrote the code for this experiment. I’ve posted the code as I think it provides a decent example of accessing OpenGL from Matlab and a good starting point for related experiments. The code is available for download here.
Shepard R., Metzler J. (1971). Mental Rotation of Three-Dimensional Objects. Science. 171.3972, 701-703.
One of the toughest problems in the domain of biologically inspired models of vision is explaining how one can properly categorize objects while recognizing novel instances of objects within a category. For example, there are an infinite number of things that you could categorize as a rectangle, yet you have no problems deciding if something is rectangular. The problem becomes fiendish when considering more complex objects such as faces.
It is exceedingly unlikely that visual information is stored in memory for all angles of all objects encountered, as this would require virtually incalculable storage space. Therefore, object recognition is probably not based on rote comparison against stored images. An interesting and demonstrable model described over a series of papers from researchers at MIT proposes that top-down components of object recognition involve a set of 2D prototype objects from which a set of object transforms are learned. These learned transforms can be applied to novel concrete instances from the same object class (Poggio, Vetter 1997).
The best way to illustrate this is by example. I implemented the Poggio and Vetter (1997) model in Matlab and tested it on a set of cuboid prototypes. One can think of these as representing the cube like objects of prior experience. For this implementation, the 3D cuboid vertices were randomly generated. The 3D representations were tilted by a few degrees to avoid an accidental perspective and then projected to 2D space using a perspective projection. The resulting 2D cuboids are rendered in figure 1.
Of course, one rarely observes an object from just one angle, so we’ll add another angle. In reality, there would be many, but not all, possible transformations of the object class. For simplicity, we’ll consider one rotation about the x-axis by 20 degrees. The resulting object class is rendered in figure 2.
Upon this rotation, the model learns how to transform the set of 2D vertices defining the non-rotated object into the 2D vertices defining the rotated objects. In this implementation, the transform is learned by a single layer neural network. However, an analytic solution also exists if the vertices form a linearly independent set; for details see Poggio & Vetter (1997). At this point, we have a single function that will transform any 2D cuboid into the same, or at least very close (if class is not linearly independent), cuboid rotated by 20 degrees.
This can be demonstrated by generating a new random cuboid, one that is not in the prototype class, and using the learned transformation function to rotate the novel cuboid. Figure 3 depicts the random cuboid in red rendered in the same location as the rotated version in blue. This is intended to assist in recognizing a successful transformation.
It is key to recognize that a traditional rotation in 3D space is not being performed on the novel cuboid. The novel object is projected into 2D space and then rotated via the learned 2D->2D transform.
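The learned 2D->2D transform can be sketched along these lines. This is a simplified Python/NumPy illustration rather than the Matlab implementation described above, and it makes several assumptions: the cuboids are axis-aligned boxes defined by width, height and depth; the projection is orthographic instead of perspective (so the object class stays exactly linear); and a least-squares fit stands in for the single layer network. All function names are made up for the example.

```python
import numpy as np


def rotation_x(deg):
    """3x3 rotation matrix about the x-axis."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rotation_y(deg):
    """3x3 rotation matrix about the y-axis."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


# Corner signs of a box centered at the origin (8 vertices).
SIGNS = np.array([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                 dtype=float)

# Small fixed tilt so depth is visible in the 2D view (avoids an accidental view).
TILT = rotation_y(25) @ rotation_x(15)


def cuboid(dims):
    """8x3 vertex array for a box with the given (width, height, depth)."""
    return SIGNS * (np.asarray(dims, dtype=float) / 2.0)


def view(vertices, extra_rotation=np.eye(3)):
    """Tilt, optionally rotate, then drop z (orthographic); returns a 16-vector."""
    rotated = vertices @ (extra_rotation @ TILT).T
    return rotated[:, :2].ravel()


def learn_transform(prototypes, deg=20.0):
    """Least-squares W such that view(p) @ W approximates the rotated view of p."""
    X = np.stack([view(p) for p in prototypes])
    Y = np.stack([view(p, rotation_x(deg)) for p in prototypes])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


rng = np.random.default_rng(0)
prototypes = [cuboid(rng.uniform(0.5, 2.0, size=3)) for _ in range(10)]
W = learn_transform(prototypes)

# A novel cuboid that is not among the prototypes.
novel = cuboid(rng.uniform(0.5, 2.0, size=3))
predicted = view(novel) @ W                # learned 2D -> 2D transform only
actual = view(novel, rotation_x(20.0))     # ground truth via a real 3D rotation
max_error = float(np.max(np.abs(predicted - actual)))
```

Under these assumptions the novel cuboid is a linear combination of the prototypes, so the learned map should reproduce its rotated view almost exactly. With a perspective projection, as in the original implementation, the match would only be approximate.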
An interesting prediction implied by this model is that when an individual mentally rotates an object, their performance should degrade relative to the number of degrees they are trying to rotate due to repeated applications of learned transforms. For instance, if we wanted to rotate the novel cuboid 40 degrees then we could simply apply the learned transform twice and this should take twice the amount of time. It also follows that the performance should degrade linearly, as the transform is a linear function. In fact, this is exactly what was reported in the classic mental rotation study by Shepard and Metzler (1971). They found a linear increase in reaction time when deciding whether or not two objects observed from different angles were the same object as the degrees of rotation between the two objects increased.
It would be interesting to see if performance holds for rotations of different increments. Shepard and Metzler tested only one increment; all rotations were in increments of 20 degrees. Differences in performance would be expected if individuals stored more than one rotation transform for an object class. For instance, if asked to rotate something by 40 degrees one may apply two iterations of 20 or several iterations of a more granular transform. However, one may be able to jump straight to a 45 or 90 degree rotation, especially for cubes.
Furthermore, experiments investigating the effects of different object classes would provide additional insight. Although Shepard and Metzler argued their objects were novel, they were all composed of cubes. It’s possible that the visual system may be able to apply the cube transforms to the cubes composing the object individually. What would happen if pyramids were used as the atomic component as opposed to cubes? The Poggio and Vetter model examined here couples transforms to an object class. Therefore, differences should be expected when manipulating object class. More specifically, one may not expect performance increases due to learning effects to carry over to a new class.
Poggio T., Vetter T. (1997). Linear Object Classes and Image Synthesis From a Single Example Image. IEEE Transactions on Pattern Analysis and Machine Intelligence. 19.7, 733-741.
Shepard R., Metzler J. (1971). Mental Rotation of Three-Dimensional Objects. Science. 171.3972, 701-703.
Dr. Hunter of the University of Colorado wrote a handy implementation of many statistical functions for Common Lisp. The package is called cl-statistics and is available via Ubuntu’s package manager. However, an inverse F cumulative distribution function is mysteriously absent. That is, a function that will provide an F statistic given two degrees of freedom and a percentile. Luckily, it’s pretty straightforward to modify the source to add it. In hopes of saving someone a couple hours of fumbling around, here’s the modification; just add the following function to the cl-statistics source file and add the function name to the export list near the beginning of the file.
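;; Inverse F CDF: returns the F statistic at PERCENTILE for DOF1 and DOF2.
;; NOTE: the line that wraps the lambda was missing from the original
;; listing (the parentheses did not balance). FIND-CRITICAL-VALUE is assumed
;; here, as it is the helper the package's own t-distribution appears to use;
;; check it against your copy of the source.
(defun f-distribution (dof1 dof2 percentile)
  (test-variables (dof1 :posint) (dof2 :posint) (percentile :prob))
  (find-critical-value
   ;; one-tailed significance of X; the search looks for the X whose
   ;; significance equals 1 - percentile
   #'(lambda (x) (f-significance x dof1 dof2 :one-tailed-p))
   (- 1 percentile)))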
I have long been interested in how the selection and use of a particular programming language by the programmer influences how they represent the problem internally. Ultimately, this would influence how, or even ‘if’, they solve the problem. As part of my studies at the University of Washington, I’ve finally had the opportunity to do some actual research in this area, albeit of very limited scope. The paper below examines the observed differences in the semantic networks of three programmers. An emphasis is placed on differences between users of languages in different paradigms, such as functional versus imperative.
This interest grew back when I was making heavy use of both the functional and interactive features of the Lua language. Over time, programming began to ‘feel’ different and I noticed I was using different types of design patterns as compared to my C++ heavy days. These and many other subjective aspects encouraged me to explore other distinctly different views of software development such as Smalltalk and Qi. I became further convinced that the linguistic homogeneity pervasive across industry is detrimental to innovation.
Unfortunately, on the topic of ‘what language is suited for what’ practitioners and researchers alike have little more than anecdotes. After all, most general purpose languages are all Turing complete and therefore computationally equivalent, so why does it matter? It matters when one considers the programmer and the machine as a composite unit. Then, the programmer is involved in a stimulus/response loop with the machine where the programming language and environment is the communication medium. At this level, the way the programmer perceives and responds to the machine state becomes important. I assert that language influences this interaction in an under-appreciated fashion regardless of underlying technical equivalencies.
Semantic Organization of Programming Language Constructs (postscript format)
I began experimenting with neural network code a number of months ago when I decided to implement a network capable of maze navigation. I had read about NNs but I wanted to have the level of understanding that comes with actually implementing such an algorithm. In this post I shall discuss the design of the NN framework, the NN intended to solve mazes, the problems with said design and my thoughts on NNs in general and connectionism. The code was all written in Common Lisp (SBCL) in Eclipse/CUSP on a Linux system.
The maze environment was realized as a 2D array of 0s and 1s where a 1 represented a wall. The mazes were randomly generated using a recursive spatial partitioning algorithm. There are some pretty pictures in the programming category of this blog. The decision to have mazes be the problem domain was somewhat arbitrary and introduced some complexity. In retrospect, finding efficient solutions to mazes was not the best problem to tackle with a simple NN, although it seems totally doable with the right network design.
The implementation of the NN framework leveraged the Lisp object system allowing for simple extension of the algorithm. The default Neuron object was associated with a set of generic methods Activation, Learning and Propagation. A user wishing to change the Activation function could simply inherit from Neuron and override Activation for their Neuron type. Different types of Neurons and their associated functions could be easily intermixed within the same network without much explicit ‘glue’ code. The implementation also included an object/functional abstraction of a neural layer as used for feedforward networks. The highest level abstraction, feedforward-nn, encapsulated a set of neural layers, an explicit input layer and output layer, and user defined functions that specify the mapping from the problem domain to the input and output layers. At least from my own use, these abstractions facilitated very good flexibility and ease of reuse. They also map back intuitively to typical theoretical specifications of NNs.
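The extension pattern described above looks roughly like this when translated into Python classes. This is only an analogy to the Lisp object system version, not the original code: the class and method names, the step threshold default, and the perceptron-style learning rule are all assumptions chosen to keep the example short.

```python
import math


class Neuron:
    """Default neuron with activation, propagation and learning methods."""

    def __init__(self, weights, rate=0.1):
        self.weights = list(weights)
        self.rate = rate  # learning rate (assumed parameter)

    def activation(self, net):
        """Default activation: fire (1.0) only when the net input is positive."""
        return 1.0 if net > 0.0 else 0.0

    def propagate(self, inputs):
        """Weighted sum of the inputs passed through the activation function."""
        net = sum(w * x for w, x in zip(self.weights, inputs))
        return self.activation(net)

    def learn(self, inputs, target):
        """Perceptron-style weight update (an assumed rule for illustration)."""
        error = target - self.propagate(inputs)
        self.weights = [w + self.rate * error * x
                        for w, x in zip(self.weights, inputs)]


class SigmoidNeuron(Neuron):
    """Changes only the activation; everything else is inherited."""

    def activation(self, net):
        return 1.0 / (1.0 + math.exp(-net))


# Different neuron types mixed in one layer without any glue code.
layer = [Neuron([0.5, -0.5]), SigmoidNeuron([0.5, -0.5])]
outputs = [n.propagate([1.0, 1.0]) for n in layer]
```

With these weights both neurons see a net input of zero, so the step neuron stays silent while the sigmoid neuron sits at its midpoint.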
I quickly discovered the difficult aspect of NNs is not the implementation of an NN framework but rather the design of a network to solve a particular problem. Initially, I considered ‘evolving’ a network using genetic algorithms. Indeed, this would have been neat, especially if it worked. However, I felt that would divert my focus too much and require even more time. I settled on an intuitive design that included 9 input neurons and 8 output neurons; there were no intermediate layers. The neurons were wired in feed-forward fashion with the exception of the output layer feeding back into the input layer (more on this in a moment). The first 8 neurons of the input layer represented a ‘touch sense’ where the neuron was set to fire if a wall was present in that direction (8 possible directions for a 2D cellular environment). The output layer represented the direction to move, where the sum of the output neurons’ activation levels was in the range [0, 8]. The 9th neuron on the input layer was a ‘smell sense’ which fired if the agent was closer to the cheese/exit than it was on the previous pass. Now, in order to potentiate a ‘correct’ move, a move closer to the cheese, the agent must ‘remember’ the firing sequence of the output neurons from the last pass; hence the aforementioned feedback from the output layer to the input layer.
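To make the input encoding concrete, here is a small Python sketch of how the 8 touch inputs and the smell input could be computed from the 0/1 maze array. It is not the original Lisp code. The direction ordering, treating cells outside the array as walls, and using straight-line distance for ‘closer’ are assumptions, and the feedback from the output layer is left out.

```python
import math

# Eight neighbor offsets as (row, col), clockwise starting from north (assumed order).
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def touch_inputs(maze, pos):
    """One value per direction: 1.0 if a wall (or the maze edge) is adjacent."""
    rows, cols = len(maze), len(maze[0])
    r, c = pos
    senses = []
    for dr, dc in DIRECTIONS:
        nr, nc = r + dr, c + dc
        outside = not (0 <= nr < rows and 0 <= nc < cols)
        senses.append(1.0 if outside or maze[nr][nc] == 1 else 0.0)
    return senses


def smell_input(pos, prev_pos, goal):
    """1.0 if the agent is closer to the goal than on the previous pass."""
    return 1.0 if math.dist(pos, goal) < math.dist(prev_pos, goal) else 0.0


def input_vector(maze, pos, prev_pos, goal):
    """The 9 input activations: 8 touch senses followed by the smell sense."""
    return touch_inputs(maze, pos) + [smell_input(pos, prev_pos, goal)]


# Tiny corridor: 1 is a wall, 0 is open floor.
maze = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
inputs = input_vector(maze, pos=(1, 1), prev_pos=(1, 1), goal=(1, 2))
```

In this corridor the only open neighbor of (1, 1) is to the east, so every touch input fires except the third, and the smell input stays off because the agent has not moved.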
Unfortunately, this critter never does get to the cheese. Typically after a few moves it settles into a corner and never departs. After a direction is established, the critter becomes more and more determined to keep going that direction. To be successful, there must be some notion of de-potentiation after continued lack of stimulus on neuron 9. That way, it would eventually try another direction even when the input stimulus remains constant. Therefore, I believe some additional tweaking could produce a network capable of solving the mazes.
Regardless, I feel I’ve accomplished what I set out for, which was to obtain a fundamental understanding of NNs, their potential and their limitations. Of particular interest to me when I started the project was to evolve my own opinion regarding the biological validity of NNs. I believe a major shortcoming of the current NN definitions, in the scope of biological simulation, is the total reliance on changes in ‘physical’ representation to map to changes in behavior. Given identical input to an NN you will have identical output until the weights, part of the physical representation, are altered. However, I’m skeptical that changes in biological behavior can be modeled purely from these representational changes. There are meta-representational aspects that must be considered such as firing frequency. More simply, you have the wire but you also have what is transmitted ‘over’ the wire. NNs appear to focus on changing the wires to produce changes in behavior. Indeed, it is known that long term potentiation (LTP) does exist and it likely does produce changes in behavior. However, I assert that it is likely not the only mode under which behavioral changes occur, and NNs do not currently encompass any mode besides LTP. Rhetorically, how would one describe the behavior of working memory using a model dependent on LTP? To be clear, this is not to say that the necessary behavior cannot ‘emerge’ from an NN. I’m merely stating that the current formalism, in and of itself, doesn’t seem to allow for such behavior.
Another concern regarding the use of NNs to understand biological function is the potential complexity of the NN resulting from an attempt to model an abstract biological function such as the meaning of an object (I am assuming that the mind is ultimately reducible to the physical or computational realm). As I alluded to in the previous paragraph, I assert that abstract concepts are meta-representational, or that one must look beyond a simple neural correlate snapshot to understand how the more complex areas of the mind function. If this is true, then a model at the level of an NN may prove to be as much of a mystery as the physical reality it was intended to model, and may resist proofs of its correctness.
In the end, NNs will probably continue to prove valuable to those researching at the granular level, neuroscientists, biopsychologists, etc. There is also the professional domain that can apply them to interesting problems regardless of their biological validity. However, I’m skeptical regarding the applicability to the investigation of abstract constructs such as the meaning of objects and reasoning.
BBC News reports the AI agent at http://www.elbot.com/ convinced 3 out of 12 human investigators that it was “indistinguishable from them.” In the experimental context, the participants don’t know if they’re text chatting with a human or the AI agent. If enough people exposed to the agent are tricked into believing they’ve been chatting with a human then the agent is said to have “passed the Turing test.” This particular agent doesn’t appear to make any significant advances beyond others I’ve toyed with; they are really still toys.
Although I haven’t personally dug into the code for this critter, or similar programs, it seems likely that it uses traditional “Chomsky” grammar analysis techniques, which would explain the apparent mastery of lexical structure but very limited power over semantics. It is very easy to design questions that expose the agent’s lack of any grasp of meaning. In fact, when asked “What is 2 minus 2?” it replied “…1.” When asked very abstract questions such as “What is the difference between objectivity and subjectivity?” it would typically respond with an attempt at humor that had no contextual connection at all. A more convincing response would simply be “I don’t know.” At least this would leave the possibility open that it was a young person or someone who never studied science or philosophy. Unfortunately, after only a few questions one is left with two possibilities: you’re talking to a program, or to an idiot who just happens to have perfect lexical structure and flawless spelling and never makes typos.
In the end, I’m not very surprised at the responses of the agent. I’m more surprised that 3 people actually deemed it “indistinguishable from a human respondent.” Although, I’m not exactly sure how that’s defined. I’m skeptical that a generative grammar type approach alone will ever produce a passable “conversation agent.” Regardless of whether language is emergent or nativist, there needs to be additional formal abstractions built to encapsulate some notion of understanding. For the nativist camp, I think this means building something atop formal grammars. For the emergent camp, this may mean realizing generative grammars within a connectionist context, such as a neural network. Having a layer underneath the grammar may provide insight on how to evolve the notion of “understanding.”