diff --git a/README.md b/README.md
index cbc1dce..81c39e3 100755
--- a/README.md
+++ b/README.md
@@ -3,113 +3,32 @@ CIS565: Project 2: CUDA Pathtracer
 -------------------------------------------------------------------------------
 Fall 2012
 -------------------------------------------------------------------------------
-Due Friday, 10/12/2012
+Kong Ma
 -------------------------------------------------------------------------------
-
--------------------------------------------------------------------------------
-NOTE:
--------------------------------------------------------------------------------
-This project requires an NVIDIA graphics card with CUDA capability! Any card after the Geforce 8xxx series will work. If you do not have an NVIDIA graphics card in the machine you are working on, feel free to use any machine in the SIG Lab or in Moore100 labs. All machines in the SIG Lab and Moore100 are equipped with CUDA capable NVIDIA graphics cards. If this too proves to be a problem, please contact Patrick or Karl as soon as possible.
-
 -------------------------------------------------------------------------------
-INTRODUCTION:
--------------------------------------------------------------------------------
-In this project, you will extend your raytracer from Project 1 into a full CUDA based global illumination pathtracer. 
-
-For this project, you may either choose to continue working off of your codebase from Project 1, or you may choose to use the included basecode in this repository. The basecode for Project 2 is the same as the basecode for Project 1, but with some missing components you will need filled in, such as the intersection testing and camera raycasting methods. 
-
-How you choose to extend your raytracer into a pathtracer is a fairly open-ended problem; the supplied basecode is meant to serve as one possible set of guidelines for doing so, but you may choose any approach you want in your actual implementation, including completely scrapping the provided basecode in favor of your own from-scratch solution.
-
+Implemented Feature
 -------------------------------------------------------------------------------
-CONTENTS:
--------------------------------------------------------------------------------
-The Project2 root directory contains the following subdirectories:
-	
-* src/ contains the source code for the project. Both the Windows Visual Studio solution and the OSX makefile reference this folder for all source; the base source code compiles on OSX and Windows without modification.
-* scenes/ contains an example scene description file.
-* renders/ contains two example renders: the raytraced render from Project 1 (GI_no.bmp), and the same scene rendered with global illumination (GI_yes.bmp). 
-* PROJ1_WIN/ contains a Windows Visual Studio 2010 project and all dependencies needed for building and running on Windows 7.
-* PROJ1_OSX/ contains a OSX makefile, run script, and all dependencies needed for building and running on Mac OSX 10.8. 
-
-The Windows and OSX versions of the project build and run exactly the same way as in Project0 and Project1.
-
--------------------------------------------------------------------------------
-REQUIREMENTS:
--------------------------------------------------------------------------------
-In this project, you are given code for:
-
-* All of the basecode from Project 1, plus:
-* Intersection testing code for spheres and cubes
-* Code for raycasting from the camera
-
-You will need to implement the following features. A number of these required features you may have already implemented in Project 1. If you have, you are ahead of the curve and have less work to do! 
-
-* Full global illumination (including soft shadows, color bleeding, etc.) by pathtracing rays through the scene. 
+Finished all the basic features:
+* Full global illumination 
 * Properly accumulating emittance and colors to generate a final image
 * Supersampled antialiasing
-* Parallelization by ray instead of by pixel via string compaction (see the Physically-based shading and pathtracing lecture slides from 09/24 if you don't know what this refers to)
+* Parallelization by ray instead of by pixel via string compaction 
 * Perfect specular reflection
 
-You are also required to implement at least two of the following features. Some of these features you may have already implemented in Project 1. If you have, you may NOT resubmit those features and instead must pick two new ones to implement.
-
-* Additional BRDF models, such as Cook-Torrance, Ward, etc. Each BRDF model may count as a separate feature. 
-* Texture mapping 
-* Bump mapping
-* Translational motion blur
-* Fresnel-based Refraction, i.e. glass
-* OBJ Mesh loading and rendering without KD-Tree
+Finished following optional features
 * Interactive camera
-* Integrate an existing stackless KD-Tree library, such as CUKD (https://github.com/unvirtual/cukd)
-* Depth of field
-
-Alternatively, implementing just one of the following features can satisfy the "pick two" feature requirement, since these are correspondingly more difficult problems:
-
-* Physically based subsurface scattering and transmission
-* Implement and integrate your own stackless KD-Tree from scratch. 
-* Displacement mapping
-* Deformational motion blur
-
-As yet another alternative, if you have a feature or features you really want to implement that are not on this list, let us know, and we'll probably say yes!
+* Implemented inclusive scan for string compaction
 
 -------------------------------------------------------------------------------
-NOTES ON GLM:
+BLOG LINK:
 -------------------------------------------------------------------------------
-This project uses GLM, the GL Math library, for linear algebra. You need to know two important points on how GLM is used in this project:
-
-* In this project, indices in GLM vectors (such as vec3, vec4), are accessed via swizzling. So, instead of v[0], v.x is used, and instead of v[1], v.y is used, and so on and so forth.
-* GLM Matrix operations work fine on NVIDIA Fermi cards and later, but pre-Fermi cards do not play nice with GLM matrices. As such, in this project, GLM matrices are replaced with a custom matrix struct, called a cudaMat4, found in cudaMat4.h. A custom function for multiplying glm::vec4s and cudaMat4s is provided as multiplyMV() in intersections.h.
-
+http://gpuprojects.blogspot.com/2012/10/path-tracer.html
 -------------------------------------------------------------------------------
-BLOG
+Instruction on buiding and running
 -------------------------------------------------------------------------------
-As mentioned in class, all students should have student blogs detailing progress on projects. If you already have a blog, you can use it; otherwise, please create a blog using www.blogger.com or any other tool, such as www.wordpress.org. Blog posts on your project are due on the SAME DAY as the project, and should include:
+Key board and mouse control
 
-* A brief description of the project and the specific features you implemented.
-* A link to your github repo if the code is open source.
-* At least one screenshot of your project running.
-* A 30 second or longer video of your project running.  To create the video use http://www.microsoft.com/expression/products/Encoder4_Overview.aspx 
+1. camera movement  w,s,a,d  Or  mouse left click+mouse move
 
--------------------------------------------------------------------------------
-THIRD PARTY CODE POLICY
--------------------------------------------------------------------------------
-* Use of any third-party code must be approved by asking on Piazza.  If it is approved, all students are welcome to use it.  Generally, we approve use of third-party code that is not a core part of the project.  For example, for the ray tracer, we would approve using a third-party library for loading models, but would not approve copying and pasting a CUDA function for doing refraction.
-* Third-party code must be credited in README.md.
-* Using third-party code without its approval, including using another student's code, is an academic integrity violation, and will result in you receiving an F for the semester.
-
--------------------------------------------------------------------------------
-SELF-GRADING
--------------------------------------------------------------------------------
-* On the submission date, email your grade, on a scale of 0 to 100, to Karl, yiningli@seas.upenn.edu, with a one paragraph explanation.  Be concise and realistic.  Recall that we reserve 30 points as a sanity check to adjust your grade.  Your actual grade will be (0.7 * your grade) + (0.3 * our grade).  We hope to only use this in extreme cases when your grade does not realistically reflect your work - it is either too high or too low.  In most cases, we plan to give you the exact grade you suggest.
-* Projects are not weighted evenly, e.g., Project 0 doesn't count as much as the path tracer.  We will determine the weighting at the end of the semester based on the size of each project.
-
--------------------------------------------------------------------------------
-SUBMISSION
--------------------------------------------------------------------------------
-As with the previous project, you should fork this project and work inside of your fork. Upon completion, commit your finished project back to your fork, and make a pull request to the master repository.
-You should include a README.md file in the root directory detailing the following
+2. camera zooms  q,e  Or  mouse right click+mouse move
 
-* A brief description of the project and specific features you implemented
-* At least one screenshot of your project running, and at least one screenshot of the final rendered output of your pathtracer
-* Instructions for building and running your project if they differ from the base code
-* A link to your blog post detailing the project
-* A list of all third-party code used
\ No newline at end of file
diff --git a/renders/refra 2.6.PNG b/renders/refra 2.6.PNG
new file mode 100644
index 0000000..532c791
Binary files /dev/null and b/renders/refra 2.6.PNG differ
diff --git a/scenes/refraScene.txt b/scenes/refraScene.txt
new file mode 100644
index 0000000..97fd261
--- /dev/null
+++ b/scenes/refraScene.txt
@@ -0,0 +1,241 @@
+MATERIAL 0				//white diffuse
+RGB         1 1 1       
+SPECEX      0      
+SPECRGB     1 1 1      
+REFL        0       
+REFR        0        
+REFRIOR     0       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   0
+
+MATERIAL 1 				//red diffuse
+RGB         .63 .06 .04       
+SPECEX      0      
+SPECRGB     1 1 1      
+REFL        0       
+REFR        0        
+REFRIOR     0       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   0
+
+MATERIAL 2 				//green diffuse
+RGB         .15 .48 .09      
+SPECEX      0      
+SPECRGB     1 1 1      
+REFL        0       
+REFR        0        
+REFRIOR     0       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   0
+
+MATERIAL 3 				//red glossy
+RGB         .63 .06 .04      
+SPECEX      0      
+SPECRGB     1 1 1       
+REFL        0       
+REFR        0        
+REFRIOR     2       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   0
+
+MATERIAL 4 				//white glossy
+RGB         1 1 1     
+SPECEX      0      
+SPECRGB     1 1 1      
+REFL        1       
+REFR        0        
+REFRIOR     2      
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   0
+
+MATERIAL 5 				//glass
+RGB         1 1 1    
+SPECEX      0      
+SPECRGB     1 1 1      
+REFL        0       
+REFR        1        
+REFRIOR     2.2       
+SCATTER     0        
+ABSCOEFF    .02 5.1 5.7      
+RSCTCOEFF   13
+EMITTANCE   0
+
+MATERIAL 6 				//green glossy
+RGB         .15 .48 .09      
+SPECEX      0      
+SPECRGB     1 1 1     
+REFL        0       
+REFR        0        
+REFRIOR     2.6       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   0
+
+MATERIAL 7				//light
+RGB         1 1 1       
+SPECEX      0      
+SPECRGB     0 0 0       
+REFL        0       
+REFR        0        
+REFRIOR     0       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   1
+
+MATERIAL 8				//light
+RGB         1 1 1       
+SPECEX      0      
+SPECRGB     0 0 0       
+REFL        0       
+REFR        0        
+REFRIOR     0       
+SCATTER     0        
+ABSCOEFF    0 0 0      
+RSCTCOEFF   0
+EMITTANCE   30
+
+CAMERA
+RES         800 800
+FOVY        25
+ITERATIONS  5000
+FILE        renders/sampleScene.bmp
+frame 0
+EYE         0 4.5 12
+VIEW        0 0 -1
+UP          0 1 0
+frame 1
+EYE         0 4.5 12
+VIEW        0 0 -1
+UP          0 1 0
+
+OBJECT 0
+cube
+material 0
+frame 0
+TRANS       0 0 0
+ROTAT       0 0 90
+SCALE       .01 10 10 
+frame 1
+TRANS       0 0 0
+ROTAT       0 0 90
+SCALE       .01 10 10 
+
+OBJECT 1
+cube
+material 0
+frame 0
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10 
+frame 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10 
+
+OBJECT 2
+cube
+material 0
+frame 0
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+frame 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+OBJECT 3
+cube
+material 1
+frame 0
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+frame 1
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+OBJECT 4
+cube
+material 2
+frame 0
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+frame 1
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+OBJECT 5
+sphere
+material 4
+frame 0
+TRANS       0 2 0
+ROTAT       0 180 0
+SCALE       3 3 3
+frame 1
+TRANS       0 2 0
+ROTAT       0 180 0
+SCALE       3 3 3
+
+OBJECT 6
+sphere
+material 3
+frame 0
+TRANS       2 5 2
+ROTAT       0 180 0
+SCALE       2.5 2.5 2.5
+frame 1
+TRANS       2 5 2
+ROTAT       0 180 0
+SCALE       2.5 2.5 2.5
+
+OBJECT 7
+sphere
+material 6
+frame 0
+TRANS       -2 5 -2
+ROTAT       0 180 0
+SCALE       3 3 3
+frame 1
+TRANS       -2 5 -2
+ROTAT       0 180 0
+SCALE       3 3 3
+
+OBJECT 8
+cube
+material 5
+frame 0
+TRANS       0 3 2
+ROTAT       0 0 0
+SCALE       3 3 .3
+frame 1
+TRANS       0 2 3
+ROTAT       0 0 0
+SCALE       3 3 .3
+
+OBJECT 9
+cube
+material 8 
+frame 0
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .3 3 3
+frame 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .3 3 3
\ No newline at end of file
diff --git a/src/glm/core/func_common.hpp b/src/glm/core/func_common.hpp
index 4a696e3..a9d6acb 100755
--- a/src/glm/core/func_common.hpp
+++ b/src/glm/core/func_common.hpp
@@ -278,8 +278,8 @@ namespace glm
 	/// you would want a threshold function with a smooth
 	/// transition. This is equivalent to:
 	/// genType t;
-	/// t = clamp ((x – edge0) / (edge1 – edge0), 0, 1);
-	/// return t * t * (3 – 2 * t);
+	/// t = clamp ((x ?edge0) / (edge1 ?edge0), 0, 1);
+	/// return t * t * (3 ?2 * t);
 	/// Results are undefined if edge0 >= edge1.
 	///
 	/// @tparam genType Floating-point scalar or vector types.
diff --git a/src/glm/core/func_common.inl b/src/glm/core/func_common.inl
index 4894410..5bb8005 100755
--- a/src/glm/core/func_common.inl
+++ b/src/glm/core/func_common.inl
@@ -277,7 +277,7 @@ namespace detail
 	//// Only valid if (INT_MIN <= x-y <= INT_MAX)
 	//// min(x,y)
 	//r = y + ((x - y) & ((x - y) >> (sizeof(int) *
-	//CHAR_BIT – 1)));
+	//CHAR_BIT ?1)));
 	//// max(x,y)
 	//r = x - ((x - y) & ((x - y) >> (sizeof(int) *
 	//CHAR_BIT - 1)));
diff --git a/src/interactions.h b/src/interactions.h
index e18cfff..f94d266 100755
--- a/src/interactions.h
+++ b/src/interactions.h
@@ -24,7 +24,7 @@ __host__ __device__ glm::vec3 getRandomDirectionInSphere(float xi1, float xi2);
 __host__ __device__ glm::vec3 calculateTransmission(glm::vec3 absorptionCoefficient, float distance);
 __host__ __device__ glm::vec3 calculateTransmissionDirection(glm::vec3 normal, glm::vec3 incident, float incidentIOR, float transmittedIOR);
 __host__ __device__ glm::vec3 calculateReflectionDirection(glm::vec3 normal, glm::vec3 incident);
-__host__ __device__ Fresnel calculateFresnel(glm::vec3 normal, glm::vec3 incident, float incidentIOR, float transmittedIOR, glm::vec3 reflectionDirection, glm::vec3 transmissionDirection);
+__host__ __device__ Fresnel calculateFresnel(glm::vec3 normal, glm::vec3 incident, float incidentIOR, float transmittedIOR, glm::vec3 &reflectionDirection, glm::vec3 &transmissionDirection);
 __host__ __device__ glm::vec3 calculateRandomDirectionInHemisphere(glm::vec3 normal, float xi1, float xi2);
 
 //TODO (OPTIONAL): IMPLEMENT THIS FUNCTION
@@ -46,22 +46,46 @@ __host__ __device__ glm::vec3 calculateTransmissionDirection(glm::vec3 normal, g
 //TODO (OPTIONAL): IMPLEMENT THIS FUNCTION
 __host__ __device__ glm::vec3 calculateReflectionDirection(glm::vec3 normal, glm::vec3 incident) {
   //nothing fancy here
-  return glm::vec3(0,0,0);
+  //return glm::vec3(0,0,0);
+	return glm::normalize(incident-2.0f*normal*glm::dot(normal,incident)); 
 }
 
 //TODO (OPTIONAL): IMPLEMENT THIS FUNCTION
-__host__ __device__ Fresnel calculateFresnel(glm::vec3 normal, glm::vec3 incident, float incidentIOR, float transmittedIOR, glm::vec3 reflectionDirection, glm::vec3 transmissionDirection) {
-  Fresnel fresnel;
+__host__ __device__ Fresnel calculateFresnel(glm::vec3 normal, glm::vec3 incident, float incidentIOR, float transmittedIOR, glm::vec3 &reflectionDirection, glm::vec3 &transmissionDirection) {
+    Fresnel fresnel;
+  float incidentAngle=acos(glm::dot(normal,-incident));
+  
+  if(incidentIOR>transmittedIOR){
+		float CriticalAngle=asin(transmittedIOR/incidentIOR);
+		if(incidentAngle>CriticalAngle){//pure reflection
+		fresnel.reflectionCoefficient = 1.0f; 
+		fresnel.transmissionCoefficient =0.0f;
+		reflectionDirection=calculateReflectionDirection(normal,incident);
+		return fresnel;
+	   }
+  }
+ 
+	float TransmittedAngle=asin(sin(incidentAngle)*incidentIOR/transmittedIOR);
+	float Rs=max(pow((incidentIOR*cos(incidentAngle)-transmittedIOR*cos(TransmittedAngle))/(incidentIOR*cos(incidentAngle)+transmittedIOR*cos(TransmittedAngle)),2.0f),0.0f);
+	float Rt=max(pow((incidentIOR*cos(TransmittedAngle)-transmittedIOR*cos(incidentAngle))/(incidentIOR*cos(TransmittedAngle)+transmittedIOR*cos(incidentAngle)),2.0f),0.0f);
+	fresnel.reflectionCoefficient =(Rs+Rt)/2;
+	fresnel.transmissionCoefficient = 1-fresnel.reflectionCoefficient;
+	reflectionDirection=calculateReflectionDirection(normal,incident);
+	/*glm::vec3 crs=glm::normalize(glm::cross(incident,normal));
+	glm::vec3 axs=glm::normalize(glm::cross(normal,crs));
+	transmissionDirection=glm::normalize(tan(TransmittedAngle)*axs-normal);*/
+	float n12=incidentIOR/transmittedIOR;
+	float ni=glm::dot(normal,incident);
+	float squareValue=1.0f-n12*n12*(1.0f-ni*ni);
+	transmissionDirection=glm::normalize((-n12*ni-sqrt(squareValue))*normal+n12*incident);
 
-  fresnel.reflectionCoefficient = 1;
-  fresnel.transmissionCoefficient = 0;
   return fresnel;
 }
 
 //LOOK: This function demonstrates cosine weighted random direction generation in a sphere!
 __host__ __device__ glm::vec3 calculateRandomDirectionInHemisphere(glm::vec3 normal, float xi1, float xi2) {
     
-    //crucial difference between this and calculateRandomDirectionInSphere: THIS IS COSINE WEIGHTED!
+   // crucial difference between this and calculateRandomDirectionInSphere: THIS IS COSINE WEIGHTED!
     
     float up = sqrt(xi1); // cos(theta)
     float over = sqrt(1 - up * up); // sin(theta)
@@ -90,14 +114,111 @@ __host__ __device__ glm::vec3 calculateRandomDirectionInHemisphere(glm::vec3 nor
 //Now that you know how cosine weighted direction generation works, try implementing non-cosine (uniform) weighted random direction generation. 
 //This should be much easier than if you had to implement calculateRandomDirectionInHemisphere.
 __host__ __device__ glm::vec3 getRandomDirectionInSphere(float xi1, float xi2) {
-  return glm::vec3(0,0,0);
+   
+  float x=TWO_PI*cos(TWO_PI*xi1)*sqrt(xi2*(1-xi2));
+  float y=TWO_PI*sin(TWO_PI*xi1)*sqrt(xi2*(1-xi2));
+  float z=PI*(1-xi2);
+  return glm::vec3(x,y,z);
 }
 
 //TODO (PARTIALLY OPTIONAL): IMPLEMENT THIS FUNCTION
 //returns 0 if diffuse scatter, 1 if reflected, 2 if transmitted.
-__host__ __device__ int calculateBSDF(ray& r, glm::vec3 intersect, glm::vec3 normal, glm::vec3 emittedColor, 
+__host__ __device__ int calculateBSDF(glm::vec3 eye,ray& r, glm::vec3 intersect, glm::vec3 normal, glm::vec3 emittedColor, 
                                        AbsorptionAndScatteringProperties& currentAbsorptionAndScattering, 
-                                       glm::vec3& color, glm::vec3& unabsorbedColor, material m){
+                                       glm::vec3& color, glm::vec3& unabsorbedColor, material m,float time, rayData &newrayData){
+
+   thrust::default_random_engine rng(hash(time));
+    thrust::uniform_real_distribution<float> u01(0.0f,1.0f);
+ 
+	glm::vec3 reflectedDirection,transmittedDirection;
+	Fresnel fr;
+
+	//difffuse surface
+  if((!m.hasReflective)&&(!m.hasRefractive))
+  {
+	  newrayData.newray.origin=intersect+0.001f*normal;
+	  newrayData.newray.direction=glm::normalize(calculateRandomDirectionInHemisphere(normal,u01(rng),u01(rng)));
+	  color=m.color;
+	  if(m.specularExponent>0.0f){
+		  glm::vec3 viewDir=glm::normalize(eye-intersect);
+		  glm::vec3 lightReflection=calculateReflectionDirection(normal,-newrayData.newray.direction);
+		  float specularTerm=pow(max(glm::dot(viewDir,lightReflection),0.0f),m.specularExponent);
+		  color+=specularTerm*m.specularColor;
+	  }
+
+	 // if(m.indexOfRefraction>0.0f){
+	//   fr=calculateFresnel(normal,r.direction,1.0f,m.indexOfRefraction,reflectedDirection,transmittedDirection);	
+	//glm::vec3 viewDirection=glm::normalize(eye-intersect);
+	//glm::vec3 HalfAngleVector=glm::normalize(newrayData.newray.direction+glm::normalize(viewDirection));
+	//
+	//float vn=(glm::dot(viewDirection,normal));
+	//float hn=(glm::dot(HalfAngleVector,normal));
+	//float lh=(glm::dot(newrayData.newray.direction,HalfAngleVector));
+	//float vh=(glm::dot(viewDirection,HalfAngleVector));
+	//float ln=(glm::dot(newrayData.newray.direction,normal));
+	//	//Beckmann distribution
+	//float roughness=0.05f;
+	//float tan2alpha=(1.0f/hn)/hn-1.0f;
+	//float D=exp(-tan2alpha/roughness/roughness)/(roughness*hn*roughness*hn*PI*hn*hn);
+ //   //GAUSSIAN;
+	////  float alpha = acos(hn);
+	////D=exp(-(alpha/roughness/roughness);
+	////float F=fr.reflectionCoefficient;
+	////schlick's approximation
+	////fr=calculateFresnel(normal,normal,1.0f,m.indexOfRefraction,reflectedDirection,transmittedDirection);	
+	////float n1=m.indexOfRefraction;
+	////float rani=pow((float)(1.0f-n1)/(1.0f+n1),2.0f);
+	////float F = rani+(1-rani)*pow( 1.0f - vh, 5.0f );
+	//float F=fr.reflectionCoefficient;
+	//float G=min(1.0f,2*hn/vh*min(vn,ln));
+	//float specularTerm= max(D/4.0*F/vn*G/ln,0.0);
+	//color+=specularTerm*m.specularColor;
+	//  }
+	  return 0;
+  }    
+		if(glm::dot(normal,r.direction)<0.001)
+		fr=calculateFresnel(normal,r.direction,1.0f,m.indexOfRefraction,reflectedDirection,transmittedDirection);	
+		else 
+		fr=calculateFresnel(-normal,r.direction,m.indexOfRefraction,1.0f,reflectedDirection,transmittedDirection);	
+ 
+  bool reflectiveRay=false,refractiveRay=false;
+  if((m.hasReflective)&&(m.hasRefractive)){
+	  if(abs(fr.transmissionCoefficient)<0.1){
+		  reflectiveRay=true;
+	  }else{
+		 float russianRoulette = (float)u01(rng);
+		if(russianRoulette<0.5)
+		 reflectiveRay=true;
+		else
+		 refractiveRay=true;
+	  }
+	 }else if(m.hasReflective){
+		 reflectiveRay=true;
+	 }else if(m.hasRefractive){
+		 refractiveRay=true;
+	 }
+
+
+  if(reflectiveRay){
+	  newrayData.newray.origin=intersect+0.001f*normal;
+	  newrayData.newray.direction=reflectedDirection;
+	  newrayData.coeff=fr.reflectionCoefficient;
+	  color=m.color*fr.reflectionCoefficient;
+	    if(m.indexOfRefraction>0.0f){
+		  glm::vec3 viewDir=glm::normalize(eye-intersect);
+		  glm::vec3 lightReflection=calculateReflectionDirection(normal,-newrayData.newray.direction);
+		  float specularTerm=pow(max(glm::dot(viewDir,lightReflection),0.0f),m.specularExponent);
+		  color+=specularTerm*m.specularColor;
+	  }
+	  return 1;
+  }
+  if(refractiveRay){
+	  newrayData.newray.origin=intersect+0.001f*normal;
+	  newrayData.newray.direction=transmittedDirection;
+	  newrayData.coeff=fr.transmissionCoefficient;
+	  color=m.color*fr.transmissionCoefficient;
+    return 2;
+  }
 
   return 1;
 };
diff --git a/src/intersections.h b/src/intersections.h
index 714e918..123bc2d 100755
--- a/src/intersections.h
+++ b/src/intersections.h
@@ -18,7 +18,6 @@ __host__ __device__ glm::vec3 multiplyMV(cudaMat4 m, glm::vec4 v);
 __host__ __device__ glm::vec3 getSignOfRay(ray r);
 __host__ __device__ glm::vec3 getInverseDirectionOfRay(ray r);
 __host__ __device__ float boxIntersectionTest(staticGeom sphere, ray r, glm::vec3& intersectionPoint, glm::vec3& normal);
-__host__ __device__ float boxIntersectionTest(glm::vec3 boxMin, glm::vec3 boxMax, staticGeom box, ray r, glm::vec3& intersectionPoint, glm::vec3& normal);
 __host__ __device__ float sphereIntersectionTest(staticGeom sphere, ray r, glm::vec3& intersectionPoint, glm::vec3& normal);
 __host__ __device__ glm::vec3 getRandomPointOnCube(staticGeom cube, float randomSeed);
 
@@ -69,98 +68,159 @@ __host__ __device__ glm::vec3 getSignOfRay(ray r){
   return glm::vec3((int)(inv_direction.x < 0), (int)(inv_direction.y < 0), (int)(inv_direction.z < 0));
 }
 
-//Wrapper for cube intersection test for testing against unit cubes
-__host__ __device__  float boxIntersectionTest(staticGeom box, ray r, glm::vec3& intersectionPoint, glm::vec3& normal){
-  return boxIntersectionTest(glm::vec3(-.5,-.5,-.5), glm::vec3(.5,.5,.5), box, r, intersectionPoint, normal);
-}
-
+//TODO: IMPLEMENT THIS FUNCTION
 //Cube intersection test, return -1 if no intersection, otherwise, distance to intersection
-__host__ __device__  float boxIntersectionTest(glm::vec3 boxMin, glm::vec3 boxMax, staticGeom box, ray r, glm::vec3& intersectionPoint, glm::vec3& normal){
-    glm::vec3 currentNormal = glm::vec3(0,0,0);
-
-    ray ro = r;
-
-    glm::vec3 iP0 = multiplyMV(box.inverseTransform,glm::vec4(r.origin, 1.0f));
-    glm::vec3 iP1 = multiplyMV(box.inverseTransform,glm::vec4(r.origin+r.direction, 1.0f));
-    glm::vec3 iV0 = iP1 - iP0;
-
-    r.origin = iP0; 
-    r.direction = glm::normalize(iV0);
-
-    float tmin, tmax, tymin, tymax, tzmin, tzmax;
-
-    glm::vec3 rsign = getSignOfRay(r);
-    glm::vec3 rInverseDirection = getInverseDirectionOfRay(r);
-
-    if((int)rsign.x==0){
-      tmin = (boxMin.x - r.origin.x) * rInverseDirection.x;
-      tmax = (boxMax.x - r.origin.x) * rInverseDirection.x;
-    }else{
-      tmin = (boxMax.x - r.origin.x) * rInverseDirection.x;
-      tmax = (boxMin.x - r.origin.x) * rInverseDirection.x;
-    }
-
-    if((int)rsign.y==0){
-      tymin = (boxMin.y - r.origin.y) * rInverseDirection.y;
-      tymax = (boxMax.y - r.origin.y) * rInverseDirection.y;
-    }else{
-      tymin = (boxMax.y - r.origin.y) * rInverseDirection.y;
-      tymax = (boxMin.y - r.origin.y) * rInverseDirection.y;
-    }
-
-    if ( (tmin > tymax) || (tymin > tmax) ){
-        return -1;
-    }
-    if (tymin > tmin){
-        tmin = tymin;
-    }
-    if (tymax < tmax){
-        tmax = tymax;
-    }
-
-    if((int)rsign.z==0){
-      tzmin = (boxMin.z - r.origin.z) * rInverseDirection.z;
-      tzmax = (boxMax.z - r.origin.z) * rInverseDirection.z;
-    }else{
-      tzmin = (boxMax.z - r.origin.z) * rInverseDirection.z;
-      tzmax = (boxMin.z - r.origin.z) * rInverseDirection.z;
-    }
-
-    if ( (tmin > tzmax) || (tzmin > tmax) ){
-        return -1;
-    }
-    if (tzmin > tmin){
-        tmin = tzmin;
-    }
-    if (tzmax < tmax){
-        tmax = tzmax;
-    }
-    if(tmin<0){
-        return -1;
-    }
+__host__ __device__  float boxIntersectionTest(staticGeom box, ray r, glm::vec3& intersectionPoint, glm::vec3& normal){
+  //transform from world coords to object coords
+  glm::vec3 ro = multiplyMV(box.inverseTransform, glm::vec4(r.origin,1.0f));
+  glm::vec3 rd = glm::normalize(multiplyMV(box.inverseTransform, glm::vec4(r.direction,0.0f)));
 
-    glm::vec3 osintersect = r.origin + tmin*r.direction;
-
-    if(abs(osintersect.x-abs(boxMax.x))<.001){
-        currentNormal = glm::vec3(1,0,0);
-    }else if(abs(osintersect.y-abs(boxMax.y))<.001){
-        currentNormal = glm::vec3(0,1,0);
-    }else if(abs(osintersect.z-abs(boxMax.z))<.001){
-        currentNormal = glm::vec3(0,0,1);
-    }else if(abs(osintersect.x+abs(boxMin.x))<.001){
-        currentNormal = glm::vec3(-1,0,0);
-    }else if(abs(osintersect.y+abs(boxMin.y))<.001){
-        currentNormal = glm::vec3(0,-1,0);
-    }else if(abs(osintersect.z+abs(boxMin.z))<.001){
-        currentNormal = glm::vec3(0,0,-1);
-    }
+  ray rt; rt.origin = ro; rt.direction = rd;
 
-    intersectionPoint = multiplyMV(box.transform, glm::vec4(osintersect, 1.0));
+  glm::vec3 localCenter=glm::vec3(0,0,0);
+
+     //calculate the intersection time t
+  glm::vec3 origin=glm::vec3(0.0f,0.0f,0.0f);
+  glm::vec3 leftbottomfront= glm::vec3(-0.5f,-0.5f,0.5f);
+  glm::vec3 rightbottomfront= glm::vec3(0.5f,-0.5f,0.5f);
+  glm::vec3 lefttopfront= glm::vec3(-0.5f,0.5f,0.5f);
+  glm::vec3 lefttopback= glm::vec3(-0.5f,0.5f,-0.5f);
+  float t;
+  if((abs(ro.x)<0.5f)&&(abs(ro.y)<0.5f)&&(abs(ro.z)<0.5f)){
+
+	  float xcount=-1;
+	  float ycount=-1;
+	  float zcount=-1;
+	  float tcount=FLT_MAX;
+	  if(rd.x>0){
+		xcount= (rightbottomfront.x-ro.x)/rd.x;
+		}else{
+		xcount= (leftbottomfront.x-ro.x)/rd.x;
+		}
+
+	  if(rd.y> 0){
+		ycount=(lefttopfront.y-ro.y)/rd.y;
+	  }else{
+		ycount= (leftbottomfront.y-ro.y)/rd.y;
+	  }
+
+	  if(rd.z>0){
+		zcount= (lefttopfront.z-ro.z)/rd.z;
+	 }else{
+		zcount=(lefttopback.z-ro.z)/rd.z;
+	  }
+
+	  if(xcount<tcount){
+		  tcount=xcount;
+	    if(rd.x>0){
+			localCenter=glm::vec3(0.5f,0.0f,0.0f);
+		}else{
+			localCenter=glm::vec3(-0.5f,0.0,0.0f);
+		}
+	  }
+
+	  if(ycount<tcount){
+		  tcount=ycount;
+	    if(rd.y>0){
+			localCenter=glm::vec3(0.0f,0.5f,0.0f);
+		}else{
+			localCenter=glm::vec3(0.0f,-0.5f,0.0f);
+		}
+	  }
+	  if(zcount<tcount){
+		  tcount=zcount;
+	    if(rd.z>0){
+			localCenter=glm::vec3(0.0f,0.0f,0.5f);
+		}else{
+			localCenter=glm::vec3(0.0f,0.0,-0.5f);
+		}
+	  }
+	  t=tcount;
+
+	 }else{
+  
 
 
+  float xnear=FLT_MIN ;
+	float xfar=FLT_MAX;
+	float ynear=FLT_MIN ;
+	float yfar=FLT_MAX;
+	float znear=FLT_MIN ;
+	float zfar=FLT_MAX;
+	float tnear=FLT_MIN ;
+	float tfar=FLT_MAX;
+
+	
+
+	if(rd.x>0){
+		xfar= (rightbottomfront.x-ro.x)/rd.x;
+		xnear=(leftbottomfront.x-ro.x)/rd.x;
+	}else{
+		xfar= (leftbottomfront.x-ro.x)/rd.x;
+	    xnear= (rightbottomfront.x-ro.x)/rd.x;
+	}
+
+	if(rd.y> 0){
+		ynear= (leftbottomfront.y-ro.y)/rd.y;
+		yfar=(lefttopfront.y-ro.y)/rd.y;
+	}else{
+		ynear= (lefttopfront.y-ro.y)/rd.y;
+	    yfar= (leftbottomfront.y-ro.y)/ rd.y;
+	}
+
+	if((xnear>yfar)||(ynear>xfar))return -1;
+	if(xnear>tnear){
+		tnear=xnear;
+		if(rd.x>0){
+			localCenter=glm::vec3(-0.5f,0.0f,0.0f);
+		}else{
+			localCenter=glm::vec3(0.5f,0.0,0.0f);
+		}
+
+	}
+	if(xfar<tfar)tfar=xfar;
+	if(ynear>tnear){
+		tnear=ynear;
+			if(rd.y>0){
+			localCenter=glm::vec3(0.0f,-0.5f,0.0f);
+		}else{
+			localCenter=glm::vec3(0.0f,0.5f,0.0f);
+		}
+
+	}
+    if(yfar<tfar)tfar=yfar;
+	if(tnear>tfar)return -1;
+	else if(tfar<0)return -1;
+	if(rd.z>0){
+		zfar= (lefttopfront.z-ro.z)/rd.z;
+		znear=(lefttopback.z-ro.z)/rd.z;
+	}else{
+		znear = (lefttopfront.z-ro.z)/rd.z;
+		zfar =(lefttopback.z-ro.z)/rd.z;
+	}
+	if(znear>tnear){
+		tnear=znear;
+		if(rd.z>0){
+			localCenter=glm::vec3(0.0f,0.0f,-0.5f);
+		}else{
+			localCenter=glm::vec3(0.0f,0.0f,0.5f);
+		}
+	}
+	if(zfar<tfar)tfar=zfar;
+	if(tnear>tfar)return -1;
+	else if(tfar<0)return -1;
+	t=tnear;
+ }
+   
+  //transform from object coords to world coords
+  glm::vec3 realIntersectionPoint = multiplyMV(box.transform, glm::vec4(getPointOnRay(rt, t), 1.0));
+  glm::vec3 realOrigin = multiplyMV(box.transform, glm::vec4(0,0,0,1));
+  glm::vec3 realCenter= multiplyMV(box.transform, glm::vec4(localCenter.x,localCenter.y,localCenter.z,1));
 
-    normal = multiplyMV(box.transform, glm::vec4(currentNormal,0.0));
-    return glm::length(intersectionPoint-ro.origin);
+  intersectionPoint = realIntersectionPoint;
+  normal = glm::normalize(realCenter - realOrigin);   
+        
+  return glm::length(r.origin - realIntersectionPoint);
 }
 
 //LOOK: Here's an intersection test example from a sphere. Now you just need to figure out cube and, optionally, triangle.
@@ -172,27 +232,55 @@ __host__ __device__  float sphereIntersectionTest(staticGeom sphere, ray r, glm:
   glm::vec3 ro = multiplyMV(sphere.inverseTransform, glm::vec4(r.origin,1.0f));
   glm::vec3 rd = glm::normalize(multiplyMV(sphere.inverseTransform, glm::vec4(r.direction,0.0f)));
 
+  float t = 0;
   ray rt; rt.origin = ro; rt.direction = rd;
+  float distance=glm::length(ro);
+  if(abs(distance-0.5f)<1e-3){
+	  t=0;
+  }else if(distance<0.5){
+	  float cosangle=glm::dot(-ro,rd);
+	  if(abs(cosangle/glm::length(ro)/glm::length(rd)-1)<1e-3)
+	  {
+		  t=radius+distance;
+	  }else if(abs(cosangle/glm::length(ro)/glm::length(rd)+1)<1e-3)
+	  {
+		  t=radius-distance;
+	  } else{
+		  float res=pow(radius,2)-pow(distance,2);
+		  float squareRoot=(pow(2.0f*distance*cosangle,2.0f)-4*sqrt(res))/4.0f;
+		  float firstTerm=-distance*cosangle;
+		  float t1=firstTerm+sqrt(squareRoot);
+		  float t2=firstTerm-sqrt(squareRoot);
+		  if((t1<0)&&(t2<0)){
+			  return -1;
+		  }else if((t1>0)&&(t2>0)){
+			  t=min(t1,t2);
+		  }else{
+			  t=max(t1,t2);
+		  }
+	  }
+
+  }else{
+	  float vDotDirection = glm::dot(rt.origin, rt.direction);
+	  float radicand = vDotDirection * vDotDirection - (glm::dot(rt.origin, rt.origin) - pow(radius, 2));
+	  if (radicand < 0){
+		return -1;
+	  }
   
-  float vDotDirection = glm::dot(rt.origin, rt.direction);
-  float radicand = vDotDirection * vDotDirection - (glm::dot(rt.origin, rt.origin) - pow(radius, 2));
-  if (radicand < 0){
-    return -1;
-  }
-  
-  float squareRoot = sqrt(radicand);
-  float firstTerm = -vDotDirection;
-  float t1 = firstTerm + squareRoot;
-  float t2 = firstTerm - squareRoot;
+	  float squareRoot = sqrt(radicand);
+	  float firstTerm = -vDotDirection;
+	  float t1 = firstTerm + squareRoot;
+	  float t2 = firstTerm - squareRoot;
   
-  float t = 0;
-  if (t1 < 0 && t2 < 0) {
-      return -1;
-  } else if (t1 > 0 && t2 > 0) {
-      t = min(t1, t2);
-  } else {
-      t = max(t1, t2);
-  }     
+
+	  if (t1 < 0 && t2 < 0) {
+		  return -1;
+	  } else if (t1 > 0 && t2 > 0) {
+		  t = min(t1, t2);
+	  } else {
+		  t = max(t1, t2);
+	  }  
+  }
 
   glm::vec3 realIntersectionPoint = multiplyMV(sphere.transform, glm::vec4(getPointOnRay(rt, t), 1.0));
   glm::vec3 realOrigin = multiplyMV(sphere.transform, glm::vec4(0,0,0,1));
@@ -201,6 +289,7 @@ __host__ __device__  float sphereIntersectionTest(staticGeom sphere, ray r, glm:
   normal = glm::normalize(realIntersectionPoint - realOrigin);   
         
   return glm::length(r.origin - realIntersectionPoint);
+
 }
 
 //returns x,y,z half-dimensions of tightest bounding box
@@ -261,28 +350,29 @@ __host__ __device__ glm::vec3 getRandomPointOnCube(staticGeom cube, float random
        
 }
 
+//TODO: IMPLEMENT THIS FUNCTION
 //Generates a random point on a given sphere
 __host__ __device__ glm::vec3 getRandomPointOnSphere(staticGeom sphere, float randomSeed){
 
-  float radius=.5f;
-  thrust::default_random_engine rng(hash(randomSeed));
-  thrust::uniform_real_distribution<float> u01(0,1);
-  thrust::uniform_real_distribution<float> u02(-0.5,0.5);
-
-  glm::vec3 point = glm::vec3(0,0,0);
-  float x=(float)u02(rng);
-  float y=(float)u02(rng);
-  float z=0;
-  float russianRoulette = (float)u01(rng); 
-  if(russianRoulette<0.5){
-    z=(float)sqrt(radius*radius-x*x-y*y);
-  }else
-    z=-(float)sqrt(radius*radius-x*x-y*y);
-
-  point=glm::vec3(x,y,z);
-  glm::vec3 randPoint = multiplyMV(sphere.transform, glm::vec4(point,1.0f));
-
-  return randPoint;
+	float radius=.5f;
+    thrust::default_random_engine rng(hash(randomSeed));
+    thrust::uniform_real_distribution<float> u01(0,1);
+    thrust::uniform_real_distribution<float> u02(-0.5,0.5);
+
+	glm::vec3 point = glm::vec3(0,0,0);
+    float x=(float)u02(rng);
+	float y=(float)u02(rng);
+	float z=0;
+	float russianRoulette = (float)u01(rng); 
+	if(russianRoulette<0.5){
+		z=(float)sqrt(radius*radius-x*x-y*y);
+	}else
+		z=-(float)sqrt(radius*radius-x*x-y*y);
+
+	point=glm::vec3(x,y,z);
+	glm::vec3 randPoint = multiplyMV(sphere.transform, glm::vec4(point,1.0f));
+
+    return randPoint;
 }
 
 #endif
\ No newline at end of file
diff --git a/src/main.cpp b/src/main.cpp
index 4e94892..9504410 100755
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -52,7 +52,7 @@ int main(int argc, char** argv){
   renderCam = &renderScene->renderCam;
   width = renderCam->resolution[0];
   height = renderCam->resolution[1];
-
+  firstRays=new ray[width*height];
   if(targetFrame>=renderCam->frames){
     cout << "Warning: Specified target frame is out of range, defaulting to frame 0." << endl;
     targetFrame = 0;
@@ -90,7 +90,8 @@ int main(int argc, char** argv){
   #else
 	  glutDisplayFunc(display);
 	  glutKeyboardFunc(keyboard);
-
+	  glutMouseFunc(mouse);
+	  glutMotionFunc(motion);
 	  glutMainLoop();
   #endif
   return 0;
@@ -105,7 +106,7 @@ void runCuda(){
   // Map OpenGL buffer object for writing from CUDA on a single GPU
   // No data is moved (Win & Linux). When mapped to CUDA, OpenGL should not use this buffer
   
-  if(iterations<renderCam->iterations){
+  if(iterations<(int)renderCam->iterations){
     uchar4 *dptr=NULL;
     iterations++;
     cudaGLMapBufferObject((void**)&dptr, pbo);
@@ -114,16 +115,16 @@ void runCuda(){
     geom* geoms = new geom[renderScene->objects.size()];
     material* materials = new material[renderScene->materials.size()];
     
-    for(int i=0; i<renderScene->objects.size(); i++){
+    for(unsigned int i=0; i<renderScene->objects.size(); i++){
       geoms[i] = renderScene->objects[i];
     }
-    for(int i=0; i<renderScene->materials.size(); i++){
+    for(unsigned int i=0; i<renderScene->materials.size(); i++){
       materials[i] = renderScene->materials[i];
     }
     
-  
+
     // execute the kernel
-    cudaRaytraceCore(dptr, renderCam, targetFrame, iterations, materials, renderScene->materials.size(), geoms, renderScene->objects.size() );
+    cudaRaytraceCore(dptr, renderCam, targetFrame, iterations, materials, renderScene->materials.size(), geoms, renderScene->objects.size(),firstRays);
     
     // unmap buffer object
     cudaGLUnmapBufferObject(pbo);
@@ -220,12 +221,92 @@ void runCuda(){
 
 	void keyboard(unsigned char key, int x, int y)
 	{
-		std::cout << key << std::endl;
+		//std::cout << key << std::endl;
 		switch (key) 
 		{
 		   case(27):
 			   exit(1);
 			   break;
+
+		   case 'w':
+			   (renderCam->positions)->y+=1.0f;
+			   deleteImage();
+			   break;
+		   case 's':
+			   (renderCam->positions)->y-=1.0f;
+			   deleteImage();
+			   break;
+
+		   case 'a':
+			   (renderCam->positions)->x+=1.0f;
+			   deleteImage();
+			   break;
+		   case 'd':
+			   (renderCam->positions)->x-=1.0f;
+			   deleteImage();
+			   break;
+		  
+		   case 'q':
+			   (renderCam->positions)->z+=1.0f;
+			   deleteImage();
+			   break;
+		   case 'e':
+			   (renderCam->positions)->z-=1.0f;
+			   deleteImage();
+			   break;
+		}
+
+		glutPostRedisplay();
+	}
+
+	enum actions{MOVE,FOCUS,ZOME,NONE};
+	static GLint        action=NONE;
+	static int     xStart = 0.0, yStart = 0.0;
+
+	void mouse(int button, int state, int x, int y){
+	 
+		 if(state == GLUT_DOWN)
+		{
+			if(button == GLUT_LEFT_BUTTON)
+			{
+				action=MOVE;
+			}else if(button == GLUT_MIDDLE_BUTTON)
+			{
+				action=FOCUS;
+			}else if(button == GLUT_RIGHT_BUTTON)
+			{
+				action=ZOME;
+			}
+			xStart=x;
+			yStart=y;		  
+		}
+		else
+		{
+			action=NONE;
+		}
+	}
+
+	void motion(int x,int y){
+		switch(action){
+		case MOVE:
+			renderCam->positions->x-=(float)(x-xStart)*0.01f;
+			renderCam->positions->y+=(float)(y-yStart)*0.01f;
+			deleteImage();
+			break;
+		case ZOME:
+			renderCam->positions->z+=float(y-yStart)*0.01f;
+			deleteImage();
+			break;	
+		}
+		xStart=x;
+		yStart=y;
+		glutPostRedisplay();
+	}
+
+	void deleteImage(){
+		for(int i=0; i<renderCam->resolution.x*renderCam->resolution.y; i++){
+		renderCam->image[i] = glm::vec3(0,0,0);
+		iterations=0;
 		}
 	}
 
diff --git a/src/main.h b/src/main.h
index 55daf50..b1f2263 100755
--- a/src/main.h
+++ b/src/main.h
@@ -38,6 +38,7 @@ using namespace std;
 //----------PATHTRACER-----------
 //-------------------------------
 
+ray* firstRays;
 scene* renderScene;
 camera* renderCam;
 int targetFrame;
@@ -78,6 +79,9 @@ void runCuda();
 #else
 	void display();
 	void keyboard(unsigned char key, int x, int y);
+	void mouse(int button, int state, int x, int y);
+	void motion(int x,int y); 
+	void deleteImage();
 #endif
 
 //-------------------------------
diff --git a/src/raytraceKernel.cu b/src/raytraceKernel.cu
index d473c89..fa67eee 100755
--- a/src/raytraceKernel.cu
+++ b/src/raytraceKernel.cu
@@ -4,18 +4,30 @@
 //       Rob Farber for CUDA-GL interop, from CUDA Supercomputing For The Masses: http://www.drdobbs.com/architecture-and-design/cuda-supercomputing-for-the-masses-part/222600097
 //       Peter Kutz and Yining Karl Li's GPU Pathtracer: http://gpupathtracer.blogspot.com/
 //       Yining Karl Li's TAKUA Render, a massively parallel pathtracing renderer: http://www.yiningkarlli.com
-
+#include <thrust/scan.h>
 #include <stdio.h>
 #include <cuda.h>
 #include <cmath>
 #include "sceneStructs.h"
 #include <cutil_math.h>
+#include "glm/glm.hpp"
 #include "utilities.h"
 #include "raytraceKernel.h"
 #include "intersections.h"
 #include "interactions.h"
 #include <vector>
-#include "glm/glm.hpp"
+#include "cuda.h"
+#include "cuda_runtime.h"
+#include "device_launch_parameters.h"
+#include <ctime>
+
+
+
+//#define DEPTHOFFIELD
+//#define SUPERSAMPLING
+float Lensdistance=5;
+int NumberOfSampling=5;
+
 
 void checkCUDAError(const char *msg) {
   cudaError_t err = cudaGetLastError();
@@ -31,49 +43,130 @@ __host__ __device__ glm::vec3 generateRandomNumberFromThread(glm::vec2 resolutio
   int index = x + (y * resolution.x);
    
   thrust::default_random_engine rng(hash(index*time));
-  thrust::uniform_real_distribution<float> u01(0,1);
+  thrust::uniform_real_distribution<float> u01(-1,1);
 
   return glm::vec3((float) u01(rng), (float) u01(rng), (float) u01(rng));
 }
 
-//Kernel that does the initial raycast from the camera and caches the result. "First bounce cache, second bounce thrash!"
-__host__ __device__ ray raycastFromCameraKernel(glm::vec2 resolution, float time, int x, int y, glm::vec3 eye, glm::vec3 view, glm::vec3 up, glm::vec2 fov){
-   
-  int index = x + (y * resolution.x);
-   
-  thrust::default_random_engine rng(hash(index*time));
-  thrust::uniform_real_distribution<float> u01(0,1);
-  
-  //standard camera raycast stuff
-  glm::vec3 E = eye;
-  glm::vec3 C = view;
-  glm::vec3 U = up;
-  float fovx = fov.x;
-  float fovy = fov.y;
-  
-  float CD = glm::length(C);
-  
-  glm::vec3 A = glm::cross(C, U);
-  glm::vec3 B = glm::cross(A, C);
-  glm::vec3 M = E+C;
-  glm::vec3 H = (A*float(CD*tan(fovx*(PI/180))))/float(glm::length(A));
-  glm::vec3 V = (B*float(CD*tan(-fovy*(PI/180))))/float(glm::length(B));
-  
-  float sx = (x)/(resolution.x-1);
-  float sy = (y)/(resolution.y-1);
-  
-  glm::vec3 P = M + (((2*sx)-1)*H) + (((2*sy)-1)*V);
-  glm::vec3 PmE = P-E;
-  glm::vec3 R = E + (float(200)*(PmE))/float(glm::length(PmE));
-  
-  glm::vec3 direction = glm::normalize(R);
-  //major performance cliff at this point, TODO: find out why!
-  ray r;
-  r.origin = eye;
-  r.direction = direction;
-  return r;
+__global__ void calculateRaycastFromCameraKernel(cameraData cam, float time,ray* rayArray){
+
+	
+	ray r;
+	r.origin = cam.position;
+	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+	int index = x + (y * cam.resolution.x);
+
+	thrust::default_random_engine rng(hash((int(index)*time)));
+	thrust::uniform_real_distribution<float> u01(0,3);
+	thrust::uniform_real_distribution<float> u02(-1,1);
+	//float noisex =((float)u01(rng))*0.5f;
+	//float noisey =((float)u01(rng))*0.5f;
+	float noisex ,noisey;
+	float dt=1.0f/6.0f;
+	float dt2=1.0/3.0f;
+	float russianRoulette = (float)u01(rng);
+	float random2=(float)u02(rng);
+	if(russianRoulette<0.33f){
+		noisex=-dt2+((float)u02(rng))*dt;
+		noisey=dt2+((float)u02(rng))*dt;;
+	}else if(russianRoulette<0.67f){
+		noisex=((float)u02(rng))*dt;
+		noisey=dt2+((float)u02(rng))*dt;
+	}else if(russianRoulette<1.0f){
+		noisex=dt2+((float)u02(rng))*dt;
+		noisey=dt2+((float)u02(rng))*dt;
+	}else if(russianRoulette<1.33f){
+		noisex=-dt2+((float)u02(rng))*dt;
+		noisey=((float)u02(rng))*dt;;
+	}else if(russianRoulette<1.67f){
+		noisex=((float)u02(rng))*dt;
+		noisey=((float)u02(rng))*dt;
+	}else if(russianRoulette<2.0f){
+		noisex=dt2+((float)u02(rng))*dt;
+		noisey=((float)u02(rng))*dt;
+	}if(russianRoulette<2.33f){
+		noisex=-dt2+((float)u02(rng))*dt;
+		noisey=-dt2+((float)u02(rng))*dt;;
+	}else if(russianRoulette<2.67f){
+		noisex=((float)u02(rng))*dt;
+		noisey=-dt2+((float)u02(rng))*dt;
+	}else if(russianRoulette<3.0f){
+		noisex=dt2+((float)u02(rng))*dt;
+		noisey=-dt2+((float)u02(rng))*dt;
+	}
+
+
+	if((x<=cam.resolution.x )&&( y<=cam.resolution.y)){
+    float y1=cam.resolution.y-y;
+	float x1=cam.resolution.x-x;
+	glm::vec3 A = glm::cross(cam.view,cam.up); //A= view^up
+	float ALength=glm::length(A);
+	glm::vec3 B =  glm::cross(A,cam.view);	//B <- A * C
+	float BLength=glm::length(B);
+    glm::vec3 M = cam.position + cam.view;	//M=E+C
+	float viewLength=glm::length(cam.view);
+	glm::vec3 H = A*viewLength * (float)tan(cam.fov.x*(PI/180.0f))/ ALength; //H <- (A|C|tan)/|A|
+	glm::vec3 V = B*viewLength *(float)tan(cam.fov.y*(PI/180.0f)) / BLength;   // V <- (B|C|tan)/|B|  
+	//glm::vec3 P=M+(2*((float)x1/(float)(cam.resolution.x-1))-1)*H+(2*((float)y1/(float)(cam.resolution.y-1))-1)*V;
+	glm::vec3 P=M+(2*(float)(x1+noisex)/(float)(cam.resolution.x-1)-1)*H+(2*(float)(y1+noisey)/(float)(cam.resolution.y-1)-1)*V;
+	glm::vec3 D=P-cam.position;
+	r.direction=glm::normalize(D);
+    rayArray[index]=r;
+	}
+	return;
 }
 
+__global__ void calculateRaycastFromCameraKernelSuperSampling(cameraData cam, float time,int sampleround,ray* rayArray){
+
+	ray r;
+	r.origin = cam.position;
+	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+	int index = x + (y * cam.resolution.x);
+
+	thrust::default_random_engine rng(hash(sampleround*index*time));
+	thrust::uniform_real_distribution<float> u01(0,1);
+	thrust::uniform_real_distribution<float> u02(-1,1);
+	//float noisex =((float)u01(rng))*0.45f;
+	//float noisey =((float)u01(rng))*0.45f;
+	float noisex ,noisey;
+	float russianRoulette = (float)u01(rng);
+	if(russianRoulette<0.25){
+		noisex=0.25+((float)u02(rng))*0.125f;
+		noisey=0.25+((float)u02(rng))*0.125f;;
+	}else if(russianRoulette<0.5){
+		noisex=-0.25+((float)u02(rng))*0.125f;
+		noisey=0.25+((float)u02(rng))*0.125f;
+	}else if(russianRoulette<0.75){
+		noisex=0.25+((float)u02(rng))*0.125f;
+		noisey=-0.25+((float)u02(rng))*0.125f;
+	}else{
+		noisex=-0.25+((float)u02(rng))*0.125f;
+		noisey=-0.25+((float)u02(rng))*0.125f;
+	}
+
+
+
+	if((x<=cam.resolution.x )&&( y<=cam.resolution.y)){
+    float y1=cam.resolution.y-y;
+	float x1=cam.resolution.x-x;
+	glm::vec3 A = glm::cross(cam.view,cam.up); //A= view^up
+	float ALength=glm::length(A);
+	glm::vec3 B =  glm::cross(A,cam.view);	//B <- A * C
+	float BLength=glm::length(B);
+    glm::vec3 M = cam.position + cam.view;	//M=E+C
+	float viewLength=glm::length(cam.view);
+	glm::vec3 H = A*viewLength * (float)tan(cam.fov.x*(PI/180.0f))/ ALength; //H <- (A|C|tan)/|A|
+	glm::vec3 V = B*viewLength *(float)tan(cam.fov.y*(PI/180.0f)) / BLength;   // V <- (B|C|tan)/|B|  
+	//glm::vec3 P=M+(2*((float)x1/(float)(cam.resolution.x-1))-1)*H+(2*((float)y1/(float)(cam.resolution.y-1))-1)*V;
+	glm::vec3 P=M+(2*(float)(x1+noisex)/(float)(cam.resolution.x-1)-1)*H+(2*(float)(y1+noisey)/(float)(cam.resolution.y-1)-1)*V;
+	glm::vec3 D=P-cam.position;
+	r.direction=glm::normalize(D);
+    rayArray[index]=r;
+	}
+	return;
+}
 //Kernel that blacks out a given image buffer
 __global__ void clearImage(glm::vec2 resolution, glm::vec3* image){
     int x = (blockIdx.x * blockDim.x) + threadIdx.x;
@@ -118,62 +211,275 @@ __global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3*
   }
 }
 
-//TODO: IMPLEMENT THIS FUNCTION
-//Core raytracer kernel
-__global__ void raytraceRay(glm::vec2 resolution, float time, cameraData cam, int rayDepth, glm::vec3* colors, 
-                            staticGeom* geoms, int numberOfGeoms, material* materials, int numberOfMaterials){
+__global__ void raytracefromCameraKernel(glm::vec2 resolution, float time, cameraData cam, int rayDepth, glm::vec3* colors, 
+                            staticGeom* geoms, int  numberOfGeoms,material* materials,ray* cudaFirstRays,rayData* rayList){
 
   int x = (blockIdx.x * blockDim.x) + threadIdx.x;
   int y = (blockIdx.y * blockDim.y) + threadIdx.y;
   int index = x + (y * resolution.x);
 
-  ray r = raycastFromCameraKernel(resolution, time, x, y, cam.position, cam.view, cam.up, cam.fov);
-
-  if((x<=resolution.x && y<=resolution.y)){
-
-    float MAX_DEPTH = 100000000000000000;
-    float depth = MAX_DEPTH;
-
-    for(int i=0; i<numberOfGeoms; i++){
-        glm::vec3 intersectionPoint;
-        glm::vec3 intersectionNormal;
-       if(geoms[i].type==SPHERE){
-           depth = sphereIntersectionTest(geoms[i], r, intersectionPoint, intersectionNormal);
-        }else if(geoms[i].type==CUBE){
-            depth = boxIntersectionTest(geoms[i], r, intersectionPoint, intersectionNormal);
-        }else if(geoms[i].type==MESH){
-            //triangle tests go here
-        }else{
-            //lol?
-        }
-        if(depth<MAX_DEPTH && depth>-EPSILON){
-          MAX_DEPTH = depth;
-          colors[index] = materials[geoms[i].materialid].color;
-        }
-    }
+  if((x>=resolution.x )||( y>=resolution.y))return;
+  if(index>=(resolution.x*resolution.y))return;
+  ray r=cudaFirstRays[index];
+ 
+  glm::vec3 finalColor=glm::vec3(0.0f,0.0f,0.0f);   
+	//find first intersection
+	float distance=-1.0f;
+	glm::vec3 interestPoint=glm::vec3(0,0,0);
+	glm::vec3 normal=glm::vec3(0,0,0);
+	int geoID=-1;
+	for(int i=0; i<numberOfGeoms; i++){
+		float tempdistance=-1.0;
+		glm::vec3 tempInterestPoint=glm::vec3(0,0,0);
+	    glm::vec3 tempNormal=glm::vec3(0,0,0);
+		if(geoms[i].type==SPHERE){
+			tempdistance=sphereIntersectionTest(geoms[i],r,tempInterestPoint,tempNormal);
+		}else if(geoms[i].type==CUBE){
+			tempdistance=boxIntersectionTest(geoms[i],r,tempInterestPoint,tempNormal);
+		}
 
+		if((abs(distance+1.0f)<1e-3)&&(tempdistance>0.001f)){
+			distance=tempdistance;
+			normal=tempNormal;
+			interestPoint=tempInterestPoint;
+			geoID=i;
+		}else if((tempdistance>0.001f)&&(tempdistance<distance)){
+			distance=tempdistance;
+			normal=tempNormal;
+			interestPoint=tempInterestPoint;
+			geoID=i;
+		}
 
+		}
+
+	rayData nextray;
+	//can not find intersection ,ray ends
+	if(geoID==-1){
+		finalColor=glm::vec3(0,0,0);
+		nextray.dirty=0;
+	}else {
+			material m=materials[geoms[geoID].materialid];
+			if(m.emittance>0){      ///light source
+			
+				 finalColor=m.emittance*m.color;
+				 nextray.dirty=0;
+			}else{
+
+					glm::vec3 emittedColor=glm::vec3(0.0f,0.0f,0.0f);
+					glm::vec3 unabsorbedColor=glm::vec3(0.0f,0.0f,0.0f);
+					
+					AbsorptionAndScatteringProperties currentAbsorptionAndScattering;
+					calculateBSDF(cam.position,r,interestPoint,normal,emittedColor, currentAbsorptionAndScattering,
+						finalColor, unabsorbedColor, m, (float)index*time, nextray);
+					nextray.dirty=1;
+	
+			 }
+	}
+
+
+	 colors[index] =finalColor;
+	 nextray.x=x;
+	 nextray.y=y;
+	 rayList[index]=nextray;
+
+}
+
+
+__global__ void iterationRaytrace(glm::vec2 resolution, float time, cameraData cam, int rayDepth, glm::vec3* colors, 
+	staticGeom* geoms, int numberOfGeoms,material* materials,rayData* rayList,int maxnum,int WIDTH,int currentDepth){
+ 
+  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+  int threadIndex=x+WIDTH*y;
+
+  if((x>=WIDTH )||( y>=WIDTH))return;
+  if(threadIndex>=maxnum) return;
+  rayData rd=rayList[threadIndex];
+  int colorIndex = rd.x + (rd.y * resolution.x);
+  ray r=rd.newray;
+
+
+  glm::vec3 finalColor=glm::vec3(0.0f,0.0f,0.0f);
+
+	//find first intersection
+	float distance=-1.0f;
+	glm::vec3 interestPoint=glm::vec3(0,0,0);
+	glm::vec3 normal=glm::vec3(0,0,0);
+	int geoID=-1;
+	for(int i=0; i<numberOfGeoms; i++){
+		float tempdistance=-1.0;
+		glm::vec3 tempInterestPoint=glm::vec3(0,0,0);
+	    glm::vec3 tempNormal=glm::vec3(0,0,0);
+		if(geoms[i].type==SPHERE){
+			tempdistance=sphereIntersectionTest(geoms[i],r,tempInterestPoint,tempNormal);
+		}else if(geoms[i].type==CUBE){
+			tempdistance=boxIntersectionTest(geoms[i],r,tempInterestPoint,tempNormal);
+		}
+
+		if((abs(distance+1.0f)<1e-3)&&(tempdistance>0.001f)){
+			distance=tempdistance;
+			normal=tempNormal;
+			interestPoint=tempInterestPoint;
+			geoID=i;
+		}else if((tempdistance>0.001f)&&(tempdistance<distance)){
+			distance=tempdistance;
+			normal=tempNormal;
+			interestPoint=tempInterestPoint;
+			geoID=i;
+		}
+
+		}
+
+	rayData nextray;
+	
+	//can not find intersection ,ray ends
+	 if(geoID==-1){
+		finalColor=glm::vec3(0,0,0);
+		nextray.dirty=0;
+	}else {
+			material m=materials[geoms[geoID].materialid];
+			if(m.emittance>0){      ///light source
+				finalColor=m.emittance*m.color;
+				 nextray.dirty=0;
+			}else{
+
+					glm::vec3 emittedColor=glm::vec3(0.0f,0.0f,0.0f);
+					glm::vec3 unabsorbedColor=glm::vec3(0.0f,0.0f,0.0f);
+					
+					AbsorptionAndScatteringProperties currentAbsorptionAndScattering;
+					calculateBSDF(cam.position,r,interestPoint,normal,emittedColor, currentAbsorptionAndScattering,
+						finalColor, unabsorbedColor, m, (float)threadIndex*time*currentDepth, nextray);
+					nextray.dirty=1;
+	
+			 }
+	}
+
+
+	 glm::vec3 precolor=colors[colorIndex];
+	 colors[colorIndex] = glm::vec3(precolor.x*finalColor.x,precolor.y*finalColor.y,precolor.z*finalColor.z);
+	 nextray.x=rd.x;
+	 nextray.y=rd.y;
+	 rayList[threadIndex]=nextray;
+
+}
+
+
+ __global__ void mergeImage(glm::vec2 resolution,glm::vec3* previousColors,glm::vec3* currentColors,float time){
+  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+  int index = x + (y * resolution.x);
+   if((x<=resolution.x) && (y<=resolution.y)){
+	    glm::vec3 currentColor=currentColors[index];
+		 glm::vec3 previousColor=previousColors[index];
 
-    //colors[index] = generateRandomNumberFromThread(resolution, time, x, y);
+	   currentColors[index]=currentColor/time+previousColor*(time-1.0f)/time;
    }
+
+ }
+
+ __global__ void scanRound( glm::vec2 dim,int* previewArray,int* nextarray, int boundary,int flag){
+
+	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+	int index = x + (y * dim.x);
+
+	if((x>=dim.x)||(y>=dim.y))return;
+	if(index<boundary){
+	if(index>=flag)
+			nextarray[index]=previewArray[index]+previewArray[index-flag];
+		else
+			nextarray[index]=previewArray[index];
+	}
+ }
+
+ 
+ __global__ void copyBack( glm::vec2 dim,int* previewArray,int* nextarray, int boundary){
+
+	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+	int index = x + (y * dim.x);
+	if((x>=dim.x)||(y>=dim.y))return;
+	if(index<boundary){
+	previewArray[index]=nextarray[index];
+	}
+ }
+
+ __global__ void getValue(glm::vec2 dim, rayData* raydatalist,int* indexlist,int boundary){
+	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+	int index = x + (y * dim.x);
+
+	if((x>=dim.x)||(y>=dim.y))return;
+	if(index<boundary) {
+	indexlist[index]=raydatalist[index].dirty;
+	}
+	
+}
+
+__global__ void inclusiveTOexclusive(glm::vec2 dim, rayData* raydatalist,int* indexlist,int boundary){
+	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+	int index = x + (y * dim.x);
+	if((x>=dim.x)||(y>=dim.y))return;
+	if(index<boundary) {
+	indexlist[index]-=raydatalist[index].dirty;
+	}
+	
 }
 
 
+void scan(dim3 DimBlock, dim3 DimThread, glm::vec2 dim,rayData* rayList,int* indexArray,int* tempArray, int boundary){
+
+ 
+ getValue<<<DimBlock,DimThread>>>(dim,rayList,indexArray,boundary);
+ checkCUDAError("Kernel failed! scan-1");
+  for(int d=1;d<=ceil((float)log2((float)boundary));d++){  
+		int flag=(int)pow(2.0f,d-1);
+		//std::cout<<"d="<<d<<std::endl;
+		//std::cout<<"flag="<<flag<<std::endl;
+	    scanRound<<<DimBlock,DimThread>>>(dim,indexArray,tempArray,boundary,flag);
+		checkCUDAError("Kernel failed! scan-2-1");
+		copyBack<<<DimBlock,DimThread>>>(dim,indexArray,tempArray,boundary);
+		// cudaMemcpy( indexArray, tempArray, (boundary)*sizeof(int), cudaMemcpyDeviceToDevice);
+		checkCUDAError("Kernel failed! scan-2-2");
+ }	
+
+ inclusiveTOexclusive<<<DimBlock,DimThread>>>(dim,rayList,indexArray,boundary);
+ checkCUDAError("Kernel failed! scan-3");
+
+ }
+
+
+
+__global__ void stringcompaction(glm::vec2 dim,rayData* rayList,rayData *newdataArray,int* indexArray,int maxBoundary){
+
+  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+  int index = x + (y * dim.x);
+ if((x>=dim.x)||(y>=dim.y))return;
+  if(index<maxBoundary){
+	 if(rayList[index].dirty){
+	  newdataArray[indexArray[index]]=rayList[index];
+	 }
+	}
+ }
+  
 //TODO: FINISH THIS FUNCTION
 // Wrapper for the __global__ call that sets up the kernel calls and does a ton of memory management
-void cudaRaytraceCore(uchar4* PBOpos, camera* renderCam, int frame, int iterations, material* materials, int numberOfMaterials, geom* geoms, int numberOfGeoms){
+void cudaRaytraceCore(uchar4* PBOpos, camera* renderCam, int frame, int iterations, material* materials, int numberOfMaterials, geom* geoms, int numberOfGeoms, ray* firstRays){
   
-  int traceDepth = 1; //determines how many bounces the raytracer traces
+  int traceDepth =100; //determines how many bounces the raytracer traces
 
   // set up crucial magic
   int tileSize = 8;
   dim3 threadsPerBlock(tileSize, tileSize);
   dim3 fullBlocksPerGrid((int)ceil(float(renderCam->resolution.x)/float(tileSize)), (int)ceil(float(renderCam->resolution.y)/float(tileSize)));
-  
+
   //send image to GPU
   glm::vec3* cudaimage = NULL;
   cudaMalloc((void**)&cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3));
   cudaMemcpy( cudaimage, renderCam->image, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyHostToDevice);
+
   
   //package geometry and materials and sent to GPU
   staticGeom* geomList = new staticGeom[numberOfGeoms];
@@ -193,10 +499,31 @@ void cudaRaytraceCore(uchar4* PBOpos, camera* renderCam, int frame, int iteratio
   cudaMalloc((void**)&cudageoms, numberOfGeoms*sizeof(staticGeom));
   cudaMemcpy( cudageoms, geomList, numberOfGeoms*sizeof(staticGeom), cudaMemcpyHostToDevice);
   
-  material* cudamaterials = NULL;
-  cudaMalloc((void**)&cudamaterials, numberOfMaterials*sizeof(material));
-  cudaMemcpy( cudamaterials, materials, numberOfMaterials*sizeof(material), cudaMemcpyHostToDevice);
+   //--------------------------------
+  //package materials
+  material* materialList=new material[numberOfMaterials];
 
+  for(int i=0;i<numberOfMaterials;i++){
+	  material newmaterial;
+	  newmaterial.color=materials[i].color;
+	  newmaterial.specularExponent=materials[i].specularExponent;
+	  newmaterial.specularColor=materials[i].specularColor;
+	  newmaterial.hasReflective=materials[i].hasReflective;
+	  newmaterial.hasRefractive=materials[i].hasRefractive;
+	  newmaterial.indexOfRefraction=materials[i].indexOfRefraction;
+	  newmaterial.hasScatter=materials[i].hasScatter;
+	  newmaterial.absorptionCoefficient=materials[i].absorptionCoefficient;
+	  newmaterial.reducedScatterCoefficient=materials[i].reducedScatterCoefficient;
+	  newmaterial.emittance=materials[i].emittance;
+
+	  materialList[i]=newmaterial;
+  }
+
+  material* cudamatrials=NULL;
+  cudaMalloc((void**)&cudamatrials, numberOfMaterials*sizeof(material));
+  cudaMemcpy( cudamatrials, materialList, numberOfMaterials*sizeof(material), cudaMemcpyHostToDevice);
+	 	  
+  
   //package camera
   cameraData cam;
   cam.resolution = renderCam->resolution;
@@ -205,20 +532,154 @@ void cudaRaytraceCore(uchar4* PBOpos, camera* renderCam, int frame, int iteratio
   cam.up = renderCam->ups[frame];
   cam.fov = renderCam->fov;
 
-  //kernel launches
-  raytraceRay<<<fullBlocksPerGrid, threadsPerBlock>>>(renderCam->resolution, (float)iterations, cam, traceDepth, cudaimage, cudageoms, numberOfGeoms, cudamaterials, 
-                                                                              numberOfMaterials);
 
-  sendImageToPBO<<<fullBlocksPerGrid, threadsPerBlock>>>(PBOpos, renderCam->resolution, cudaimage);
+   //first Rays cudaMemory Pointer
+  ray* cudaFirstRays = NULL;
+  cudaMalloc((void**)&cudaFirstRays, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(ray));
+
+  //saved first ray Color cudaMemory Pointer
+  rayData* cudaRayList = NULL;
+  cudaMalloc((void**)&cudaRayList, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(rayData));
 
+  //scan result cudaMemory Pointer
+  int* scanResultRayList = NULL;
+  cudaMalloc((void**)&scanResultRayList, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(int));
+  
+  //for string compaction
+  int *tempscanResultRayList=NULL;
+  cudaMalloc((void**)&tempscanResultRayList, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(int));
+  rayData* newRayDataList=NULL;
+  cudaMalloc((void**)&newRayDataList, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(rayData));
+#ifdef SUPERSAMPLING
+
+  
+   #ifdef DEPTHOFFIELD
+thrust::default_random_engine rng(hash((float)iterations));
+    thrust::uniform_real_distribution<float> u01(-1,1);
+	float xDist=(float)u01(rng);
+	float yDist=(float)u01(rng);
+
+	float length=abs(glm::dot(glm::vec3(glm::vec3(0,0,0)-cam.position),cam.view));
+	glm::vec3 focalPos=cam.position+cam.view*length;
+	cam.position+=100.0f*glm::vec3(xDist*Lensdistance*1/cam.resolution.x,yDist*Lensdistance*1/cam.resolution.y,0.0f);
+	cam.view=glm::normalize(focalPos-cam.position);
+   #endif
+
+	glm::vec3* cudaimageSuperSamping = NULL;
+	cudaMalloc((void**)&cudaimageSuperSamping, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3));
+	
+	int SampleCount=1;
+	while(SampleCount<=NumberOfSampling){
+
+#endif
+	// save the first Ray Direction
+#ifndef SUPERSAMPLING
+    calculateRaycastFromCameraKernel<<<fullBlocksPerGrid, threadsPerBlock>>>(cam,(float)iterations,cudaFirstRays);
+	//cudaMemcpy(firstRays, cudaFirstRays, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(ray), cudaMemcpyDeviceToHost);
+#else
+	calculateRaycastFromCameraKernelSuperSampling<<<fullBlocksPerGrid, threadsPerBlock>>>(cam,(float)iterations,SampleCount,cudaFirstRays);
+#endif
+   //calculate first ray color
+
+  raytracefromCameraKernel<<<fullBlocksPerGrid, threadsPerBlock>>>(renderCam->resolution, (float)iterations, cam, traceDepth, cudaimage, cudageoms, numberOfGeoms,cudamatrials,cudaFirstRays,cudaRayList);
+
+  checkCUDAError("Kernel failed! 2");
+
+  
+  //scan
+   int maxBoundary=(int)renderCam->resolution.x*(int)renderCam->resolution.y;
+   scan(fullBlocksPerGrid, threadsPerBlock,renderCam->resolution,cudaRayList,scanResultRayList,tempscanResultRayList,maxBoundary);
+   checkCUDAError("Kernel failed! 3");
+
+  int flag=-1;
+  cudaMemcpy(&flag,&(cudaRayList[maxBoundary-1].dirty),sizeof(int),cudaMemcpyDeviceToHost);
+
+  // rays for string compaction
+  stringcompaction<<<fullBlocksPerGrid, threadsPerBlock>>>(renderCam->resolution,cudaRayList,newRayDataList,scanResultRayList,maxBoundary);
+  cudaMemcpy(cudaRayList,newRayDataList,(int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(rayData), cudaMemcpyDeviceToDevice);
+
+  //get number of rays in raylist
+  int numberOfRays=0;
+  cudaMemcpy(&numberOfRays,&(scanResultRayList[maxBoundary-1]),sizeof(int),cudaMemcpyDeviceToHost);
+   if(flag==1)
+	   numberOfRays++;
+  checkCUDAError("Kernel failed! list-newlist");
+
+  //iteration kernel launches
+  int currDepth=1;
+  while((numberOfRays>0)&&(currDepth<=traceDepth)){ 
+	 //std::cout<<"depth="<<currDepth<<std::endl;
+	int length=ceil(sqrt((float)(numberOfRays)));
+    dim3 newthreadsPerBlock(tileSize, tileSize);
+	dim3 newfullBlocksPerGrid((int)ceil(float(length)/float(tileSize)), (int)ceil(float(length)/float(tileSize)));
+	glm::vec2 dim=glm::vec2(length,length);
+
+	iterationRaytrace<<<newfullBlocksPerGrid,newthreadsPerBlock>>>(renderCam->resolution, (float)iterations, cam, traceDepth, cudaimage, cudageoms, numberOfGeoms,cudamatrials,cudaRayList,numberOfRays,length,currDepth);
+	checkCUDAError("Kernel failed! 4-0");
+
+	//get flag
+	cudaMemcpy(&flag,&(cudaRayList[numberOfRays-1].dirty),sizeof(int),cudaMemcpyDeviceToHost);
+	checkCUDAError("Kernel failed! 4-1");
+
+	scan(newfullBlocksPerGrid,newthreadsPerBlock,dim,cudaRayList,scanResultRayList,tempscanResultRayList,numberOfRays);
+	checkCUDAError("Kernel failed! 4-2");
+	
+	//string compaction
+	stringcompaction<<<newfullBlocksPerGrid,newthreadsPerBlock>>>(dim,cudaRayList,newRayDataList,scanResultRayList,numberOfRays);
+	cudaMemcpy(cudaRayList,newRayDataList,(int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(rayData), cudaMemcpyDeviceToDevice);
+
+	//get number of rays
+	cudaMemcpy(&numberOfRays,&scanResultRayList[numberOfRays-1],sizeof(int),cudaMemcpyDeviceToHost);
+	if(flag==1)
+	   numberOfRays++;
+
+	checkCUDAError("Kernel failed! 4-3");
+	//std::cout<<"number of rays"<<numberOfRays<<std::endl; 
+	currDepth++;
+	
+  }
+
+ #ifdef SUPERSAMPLING
+  if(SampleCount!=1)
+	mergeImage<<<fullBlocksPerGrid, threadsPerBlock>>>(renderCam->resolution,cudaimageSuperSamping,cudaimage,(float)SampleCount);
+	checkCUDAError("Kernel failed!supersamping-1");
+
+	cudaMemcpy( cudaimageSuperSamping, cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyDeviceToDevice);
+    SampleCount++;
+	}
+#endif
+
+  //combine several iteration together
+
+  // previous image cudaMemory Pointer
+  glm::vec3* previousImage = NULL;
+  cudaMalloc((void**)&previousImage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3));
+  cudaMemcpy( previousImage, renderCam->image, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyHostToDevice);
+  checkCUDAError("Kernel failed! 6");
+
+  mergeImage<<<fullBlocksPerGrid, threadsPerBlock>>>(renderCam->resolution,previousImage,cudaimage,(float)iterations),
+  checkCUDAError("Kernel failed! 7");
+  
   //retrieve image from GPU
-  cudaMemcpy( renderCam->image, cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyDeviceToHost);
+  cudaMemcpy(renderCam->image, cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyDeviceToHost);
+
 
+   sendImageToPBO<<<fullBlocksPerGrid, threadsPerBlock>>>(PBOpos, renderCam->resolution, cudaimage);
   //free up stuff, or else we'll leak memory like a madman
   cudaFree( cudaimage );
+  cudaFree(cudaFirstRays);
+  cudaFree(cudaRayList);
+  cudaFree(scanResultRayList);
   cudaFree( cudageoms );
-  cudaFree( cudamaterials );
-  delete [] geomList;
+  cudaFree(cudamatrials);
+  cudaFree(previousImage);
+  cudaFree(tempscanResultRayList);
+  cudaFree(newRayDataList);
+#ifdef SUPERSAMPLING
+  	cudaFree(cudaimageSuperSamping);
+#endif
+  delete geomList;
+  delete materialList;
 
   // make certain the kernel has completed 
   cudaThreadSynchronize();
diff --git a/src/raytraceKernel.h b/src/raytraceKernel.h
index 331e5ce..685d775 100755
--- a/src/raytraceKernel.h
+++ b/src/raytraceKernel.h
@@ -15,6 +15,6 @@
 #include "sceneStructs.h"
 #include <cutil_math.h>
 
-void cudaRaytraceCore(uchar4* pos, camera* renderCam, int frame, int iterations, material* materials, int numberOfMaterials, geom* geoms, int numberOfGeoms);
+void cudaRaytraceCore(uchar4* pos, camera* renderCam, int frame, int iterations, material* materials, int numberOfMaterials, geom* geoms, int numberOfGeoms,ray* firstRays);
 
 #endif
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index b10f1cf..ea139f4 100755
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -18,6 +18,16 @@ struct ray {
 	glm::vec3 direction;
 };
 
+struct rayData{
+	ray newray;
+	glm::vec3 precolors;
+	float coeff;
+	int dirty;
+	int x;
+	int y;
+};
+
+
 struct geom {
 	enum GEOMTYPE type;
 	int materialid;
@@ -73,4 +83,6 @@ struct material{
 	float emittance;
 };
 
+
+
 #endif //CUDASTRUCTS_H
diff --git a/src/utilities.h b/src/utilities.h
index 5842c33..cb0e89d 100755
--- a/src/utilities.h
+++ b/src/utilities.h
@@ -17,13 +17,13 @@
 #include <vector>
 #include "cudaMat4.h"
 
-const float PI                          =3.1415926535897932384626422832795028841971;
-const float TWO_PI                      =6.2831853071795864769252867665590057683943;
-const float SQRT_OF_ONE_THIRD           =0.5773502691896257645091487805019574556476;
-const float E                           =2.7182818284590452353602874713526624977572;
-const float EPSILON                     =.000000001;
-const float ZERO_ABSORPTION_EPSILON     =0.00001;
-const float RAY_BIAS_AMOUNT             =0.0002;
+#define PI                          3.1415926535897932384626422832795028841971
+#define TWO_PI                      6.2831853071795864769252867665590057683943
+#define SQRT_OF_ONE_THIRD           0.5773502691896257645091487805019574556476
+#define E                           2.7182818284590452353602874713526624977572
+#define EPSILON                     .000000001
+#define ZERO_ABSORPTION_EPSILON     0.00001
+#define RAY_BIAS_AMOUNT             0.0002
 
 namespace utilityCore {
     extern float clamp(float f, float min, float max);