Yan's profilecomputing lifePhotosBlogListsMore ![]() | Help |
|
|
[Computing] C++ => C# => F#, more functional, more parallel (3)In this article, we will see how the functional programming style makes code parallelization extremely easy. First the parallel version of QuickSort, which is quite complex to implement in C++ but very easy in C#, is introduced. Then we will see how to make use of variables’ immutable property of function languages to parallelize code. In the end, let’s visit some useful features in .NET Parallel Extension (.NET PE), although they are less related to functional programming. Every statement is a variableRemember that in the famous C++ parallel library OpenMP, we have the “parallel for” primitive director. Using “#pragma omp parallel for”, we can simply turn a sequential for loop into a parallel one. Extending primitives in a mature language rather than defining a new one is quite reasonable and practical. That reduces the learning cost and makes the existed code still available in the extended language. However, this technique is not always effective, especially in imperative languages. For example, assume we want to implement the parallel version of QuickSort based on existed sequential source code. In C++ and OpenMP style, we may encapsulate the sequential version QuickSort(int *array, int low, int high) and Partition(int *array, int low, int high) into a new parallelized version like that: void ParallelQuickSort(int *pnArray, int nLow, int nHigh) { if (nLow >= nHigh) return; int nPivotIndex = Partition(pnArray, nLow, nHigh); int nNewLow[2], nNewHigh[2]; nNewLow[0] = nLow; nNewLow[1] = nPivotIndex + 1; nNewHigh[0] = nPivotIndex - 1; nNewHigh[1] = nHigh; #pragma omp parallel for nowait for (i = 0; i < 2; i++) QuickSort(pnArray, nNewLow[i], nNewHigh[i]); } This seems solve the problem. Yes, the function is parallel and may get a relative good result in a dual-core machine. But what if the machine has a 4-way CPU? The code still runs in 2 threads, which cannot take advantage of CPU’s potential. Of course we can manually modify the code to make it suitable for a 4-core CPU, but what if the CPU is 8-way or many-core in the future? This reveals the solution’s lack of scalability, which is critical in parallel world. An improved solution is to replace the OpenMP primitive with explicit thread creation statements. Yes, this solves previous problem, as there are more threads, with each thread occupying a core, the potential of multi-core CPUs can be used. But each thread requires a named function to specify its actions, which makes the simple QuickSort algorithm needs 4 to 5 functions to describe. In addition, remember that thread creation and switching are both very expensive. There may be more than 1000 threads running simultaneously when sorting 1 million numbers in a dual-core CPU, where the CPU is busy with switching among them rather than doing the actual work. So finally, this solution will have a bad performance in most occasions and is not practical. If we use a threading pool instead of raw threads to help distribute computation, the frequent threads switching can be avoided. A core will “pull” another thread from threading pool to process until it is free, as Figure 1 shows. Figure 1. Threading pool To this end, the problem finally gets solved in C++’s style. Quite complicated and needs a lot of background knowledge such as threading pool. Worst of all, the OpenMP library seems not providing an explicit threading pool which developers can invoke without directors like “#pragma omp parallel for”. So it is still a hard work to write practical codes. Now let’s take a look at how to implement parallel QuickSort with C# + .NET Parallel Extension (.NET PE): static void QuickSort(int[] items, int low, int high) Very similar to the sequential version. The only difference is the original QuickSort(…) turns into parallel command Task.Create(x => QuickSort(…)), which means this task is managed by threading pool and runs in parallel automatically. No extended #pragma directors, no blunt and unnecessary for loops, and certainly no need to define a totally new language, all the benefits come from the description ability of this language, in which every statement can be treated as a function variable. Just because of this feature, we can easily specify this statement should run in parallel, that one not. We can also implement functions taking functions as parameters, which helps much to build a parallel library with influent communication with users. Vivid present ability, that’s what functional programming brings to us, and builds the fundaments of an easy-to-use parallel library. We can also run this code snippet on an actual multi-core computer. In a Xeon 8-way server, sorting 10,000,000 numbers between 0 and 32767 costs 1187ms by sequential version, while only 170ms by parallel version. The acceleration ratio is about 7.0 rather than 8.0 because: First, QuickSort is a O(nlgn) level algorithm, so the theoretical acceleration ratio is about 9.3 rather than an linear result 8.0 for 10,000,000 numbers. Second, there are always overhead in parallel management, so it rarely reaches theoretical value. Third, this simple parallelization scheme does not use all the computation power, for example, all the other 7 cores has to wait and can do nothing until one core finishes partitioning all the 10,000,000 numbers. This also prevents the performance getting perfect. In conclusion, functional programming style helps to make parallelization existed code very simple and efficient. And the parallel code really works, helping us take full use of computation potential of multi-core CPUs. Immutable!As mentioned in the first article of this series, variables in functional languages are immutable, i.e. they cannot be changed after created. This helps much to parallelize existed code by avoiding writing conflicts and dead lock. In languages having functional elements like C#, such feature is also taken advantage of: string class and LINQ are typical paradigms. For objects with this property, we can write parallel version extremely easy. For example, suppose we want to select all the customers living in UK. The sequential version has been demonstrated before: var UKCustomer = from customer in customers where customer.LivingIn == "UK" select customer; To make it work in parallel, the only thing we need to do is change customers into customers.AsParallel(): var UKCustomer = from customer in customers.AsParallel() where customer.LivingIn == "UK" select customer; The .NET Parallel Extension will do all the implementation and optimization work for you. Thanks to the immutability of variables in LINQ, developers as well as compilers needn’t worry about complex locks, semaphores, etc. So that there can be a simple pattern to parallelize code, which makes an easy-to-use library possible. This pattern is called PLINQ in C#, while in F# where nearly all the variables are immutable, parallelization is even simpler, often we only need to replace “let” primitive with “let!” and everything is done. .NET Parallel ExtensionBesides taking use of functional style to make parallel programming quite easy, .NET PE has other very useful data structures. Let’s take a brief look. FutureFuture is like Task class mentioned before. It creates another “thread” in threading pool and starts it in proper time. However, unlike Task instances, Future instances can return an object as result, which appears as Value property. When some thread tries to refer to the Value property of a Future object, it will get the result immediately if the Future object has finished the computation, or gets blocked until the Future’s work gets done. It provides an alternative to events and semaphores, so that the code can be more readable and error-prone. For example, we can count a tree’s nodes like this: int CountNodes(Tree<int> node) { if (node == null) return 0; var left = Future.Create(() => CountNodes(node.Left)); int right = CountNodes(node.Right); return 1 + left.Value + right; } WriteOnceMany parallel algorithms, especially those allowing concurrent writing, rely on variables that can only be changed once, such as a lot of sorting algorithms. WriteOnce class just provides an implementation of such data structure. BlockingCollectionIn the classical producer/ consumer problem, there is expected a pipeline where all the producers can put their products in and all the consumers can fetch tasks from. In addition, the consumers will get blocked when there is no task in the pipeline. This is exactly BlockCollection does in .NET PE. It provides a thread-safe implementation for concurrently reading and writing, and automatically blocking when it is empty. From the three useful components in .NET PE, we can see that this extension library is from practical programming and consider developers’ real needs, and get a lot of benefits from functional programming style. SummaryEvery statement in a functional language is a variable, so that we can describe our intention and communicate to the compiler with more powerful vocabularies. In addition, as variables in functional languages are immutable, it enables compiler to provide extremely easy way to parallelize code. From these two factors, we can see that functional programming style does help us write parallel code more easily. And that is more functional, more parallel. P.S. Note that parallel programming is quite complex, while this series of articles only cover very little part of it. To write practical parallel codes with scalability, robustness and good performance needs knowledge about parallel algorithm, hardware architecture as well as programming language and extension library. Anyway, functional style does simplify the parallel coding procedure and provide new solutions to some sequential problems, although it has fewer effects on algorithms. [Computing] C++ => C# => F#, more functional, more parallel (2)In this article, I will introduce two sample scenes to demonstrate how we can use functional programming to make the code more intuitive, more elegant and more parallel. Text editingConsider we want to implement some text editing interface on mobile platform. First, some rich text, i.e. the mixture of texts and images are shown on the screen. Then when users tap on some text, an editing panel will pop up, where users can do some editing work. When the editing finishes, users can tap on some button to return to the original interface, where the texts are modified according to the editing. As following flow chart shows.
The obstacle is how event handler B knows which label’s text should he updated. Obviously no one knows this until user taps a label, when the information which label is tapped is delivered to event handler A by OS. Then how to communicate between A and B seems trivial. A member variable will do a good job in C++. Considering in C++ we cannot create functions at runtime, the final architecture will be like this: First, event handlers A and B appear as existed functions in a class, with a member variable L. When A is invoked, he first assigns L as the tapped label, then shows the editing panel. And when B is invoked, i.e. the finish editing button is clicked, he reads text from editing panel, and assign L’s text. Not too complex huh? Let’s see another way in C#, a language with many functional elements. As B needs information undecided until A is invoked, then why not create an event handler in A to do B’s work, dynamically? From this motivation, we have codes below, with no member variables to communicate between A and B: //this is event handler A, where the parameter sender indicates which label is tapped. void Label_Click(object sender, EventArgs e) { //add an editing panel. //... ... //create event handler B finishingButton.Click += (object textBoxSender, EventArgs ea) => { (sender as Control).Text = (textBox as TextBox).Text; //assign text after editing to the sender //please note that sender is not a local variable in event handler B, //but a parameter from event handler A. this.Controls.Remove(outerPanel); outerPanel.Dispose(); //delete editing panel and release resources. }; //some other codes ... ... } Please note that sender in B is from A, i.e. for every label clicked, A will create a different event handler and bind it to click event of finishing button. Isn’t it amazing? Although I cannot see any direct advantages from this change, but it appears more fun, more flexible and more readable. :-) WebPages downloadingIt seems very fundamental to download a web page from the internet. But it is a time consuming task so that often involves multi-threading to improve efficiency. Let’s see the typical solutions to download 10 WebPages in C++ and C#. In C++, to download a webpage, we should first create a new thread explicitly, assign it a webpage, and then start the thread. Fortunately, there are no interactions among the threads, so this simple scheme works. Let’s see how we can deal with this task in C# with asynchronous functions: HttpWebRequest req = HttpWebRequest.Create(http://g.cn); req.BeginGetResponse((IAsyncResult ar) => { HttpWebResponse rsp = req.EndGetResponse(ar); Stream stream = rsp.GetResponseStream(); //Redirect the stream to a file … … }); We can see that HttpWebRequest class has a method named BeginGetResponse, which takes a function as parameter. This method will create a thread to do the work, and when the work finishes, the function as parameter will be invoked, in which we can deal with the result. This code in C# does the same thing as previous C++ one, in spite of the built-in load balance feature of C# threads. But it shows a new way to write codes if functions can be created, destroyed, passed as parameter dynamically at runtime, just as other kinds of variables do. Of course these two samples do not demonstrate most fundamental differences between imperative programming and functional programming. But they do demonstrate some differences in how we treat functions; at least provide some fun solutions. It makes no sense to judge which one is better, but having more candidate solutions is not a bad thing, right? (To be continued… We haven’t seen how to write parallel code with functional languages yet. And I personally insist that functional programming style in these two samples shows much more flexibility and potential than traditional imperative ways. When finding more persuasive examples, I will show you! ;-) [Computing] C++ => C# => F#, more functional, more parallel (1)On DotNet Board of USTC BBS, orochi recommended a book named “Functional Programming For The Real World”. It demonstrates advantages of functional programming towards traditional imperative programming, which inspires me a lot. Here, I will first narrate my understanding of such advantages as a C# developer, then take an example to address how the programming pattern changes from pure imperative language as C++, to C# which contains a lot of functional programming elements, then to typical functional language F#. As I mentioned in this blog before, functional programming is another programming paradigm besides imperative programming. It is based on lambda calculus, which is as descriptive as a Turing machine. Anyway, this statement demonstrates nothing valuable. Let see the differences between imperative programming and functional programming in detail: 1. Function is treated as the fundamental element in functional programming. In functional programming languages, functions can be treated as independent variables (instead of considered as a pointer in languages like C++), can be created and destroyed dynamically like other kinds of variables, and can have no name (i.e. anonymous). This is a difference, but I admit that I haven’t found any advantages caused by this point directly. L Maybe when finish the studying of F#, I can answer this question. :-) 2. Functional programs tell computer what to do rather than how to do it. For example, if we want to select all the costumers living in UK, we may write like this in imperative languages like C++: for(vector<CCustomer>::const_iterator i = Customers.begin(); i != Customers.end(); ++i) This code snippet tells the computer: first pick up a customer from the customer list, check whether he/she lives in UK, if so, copy it to a new list called UKCustomers, do noting otherwise, then pick up next and do the same thing, until all the customers are processed. We are quite familiar with such “algorithms”, huh? Let’s see how we tell the computer to do the same thing with C#: var UKCustomer = What we need to do is just tell the computer: to pick up all the customers living in UK. Tell the computer what to do rather than how to do, this is called declarative style, which is used by popular languages such as SQL, as well as functional programming languages. 3. Variables in functional languages are immutable, i.e. they cannot be changed after created and initialized. A little advantage caused by this feature is, the codes is easier to understand and with less ambiguous. For example, without reading the manual, we don’t know what is the result of rectangle.Inflate(int width, int height). Will it inflate the original rectangle or return a new inflated rectangle? Both the choices are reasonable. However, in the functional world, only the second situation is possible. Of course, this style is not intended to save the “mass world” of imperative languages, but to make it possible that: 4. Write parallel code very easily. As every object in functional language is immutable, there will be no blocking, no lock or no writing collisions in the multi-thread world. Furthermore, it becomes very cheap to recognize which part of the program can be executed currently, so that the automatic optimization by the compiler is possible. That means, we need only make some tiny changes, then the program can run on multi-core computer or clusters with compiler’s optimizations. Take the sample code mentioned in the o_Ops! T-Shirt: girls.Where(x => x.IsBeauty).All(x => { x.Hi(); return true; }); which means say hi to all the beautiful girls in a list named girls. If you are a mono-core computer, you have to select all the beautiful girls, then say hi to them one by one. It is very uncompetitive, isn’t it? So we can modify the program like this, with the help of Parallel Extensions of .NET, which you can download here: girls.AsParallel().Where(x => x.IsBeauty).All(x => { x.Hi(); return true; }); Now you can say hi to all the beautiful girls simultaneously, with enough cores. :-) is it simple? No explicit processes or threads, no configuration for MPI or OpenMP. Thanks to the immutability of LINQ in C# 3.0, parallel programming turns so simple. (to be continued… next part you will see how we accomplish part of a map rendering engine in C++, C# and F#, demonstrating how the functional programming paradigm makes parallel coding so easy and natural.) [Computing]Editplus C# IDE配置心得VS是到目前为止用过的所有C#编辑器里最好用的,可惜启动不快,而且新建Solution,Project忙半天,如果建的是WPF工程更要等上n久。当BBS回帖需要写个小程序时杀鸡用牛刀实在不爽。在Stephen的启发下,想起来自己原本就用Editplus搭了个C++编译器,为何不搭一个C#的了事?于是动手实现了一下。其中走了一些弯路,把教训记下来以报后人。 [Computing]C#中的函数式思想C#的Delegate, Anonymous function和Lambda expression让它不仅是一个命令式语言(如常见的C/C++/Java等),同时也有了函数式语言(如LISP/F#等)的一些特性。事实上,在C#中应用函数思想也能大大的简化编程。 例如在一个WinForm程序中,如果想让所有的控件在被单击时都显示自己的名字(本质是一个多叉树的遍历问题),只需要这样一个函数就好了: private bool AddClickHandler(Control control) 看上去也没什么神奇的地方,不过是函数的递归而已,但其中函数可以作为一个变量存在,可以没有名称,可以赋值和作为参数传递等已经是函数式语言的基本思想了。不管怎样,这些特性的确让编程的人方便了不少。 [Computing]用工具提升程序的性能和稳定性这里说的性能指的主要是运行速度,而稳定性则侧重内存泄漏方面。这篇文主要讲述利用VSTS的Performance Profiler寻找性能瓶颈并做出优化的大体思路,以及利用Power Toys for .NET CF寻找.NET mobile程序的内存泄漏点的基本方法。 性能一个程序性能不佳,亦即对于一定量的输入数据,得到结果耗时过长,原因不外乎拙劣的算法或蹩脚的实现。算法方面相对好查一些,毕竟时间复杂度大体估计下还是差不多的。对于实现方面的问题,空想式的理论分析显然不管用,而代码排查对于稍微大一点的程序就不切实际,更何况有时候误用了低效的函数自己都不知道,把代码看再多遍也没用。从另一个角度来看,一个上规模的程序性能之所以不好,一般都存在一个或数个瓶颈。换言之,它可能90%的时间都在某两个函数里面转,这样就算把剩下的10%优化的再好,程序的性能也上不去。因此,我们需要一个工具来告诉我们,程序的大部分时间都在哪个函数里,有的放矢方有可能百发百中。 稳定性虽然.Net Framework提供了自动垃圾回收等防止内存泄漏的功能,我们不需要像C++那样很小心,但内存泄漏的问题仍然是存在的。究其原因,是很多资源不是.NET能自动释放的,比如GDI的句柄,GDI+ Image的句柄(本质还是封装了GDI)等等。这些资源如果不及时手动回收,很容易引起系统问题。比如开发手机程序的时候经常碰到OutOfMemoryException满天飞,甚至系统菜单都不能正常显示的情况。
总的来说,这篇文介绍了两个工具,对Windows下桌面程序性能的提高和移动程序内存泄漏的检测提出了一些思路。可喜的是这些工作都有相配套的工具,可惜的是自动化程度仍略嫌不够,需要编码人员有较强的素养才能迅速解决问题。 对于BBS DotNet版上出现的C#没有技术含量的论调,这篇文似乎也能说明:虽然C#入门门槛低,但想真正写出一手好程序也不简单,此中高手同样能在公司到达“不可取代”的境界。 [Computing]我眼中的指针1. 指针的构成 我认为指针包含的信息不仅仅是地址这样一个数这么简单,它包含了2个信息:基地址和类型。为了便于理解,可以用类似C++的语法形式化为: class Pointer<T> { public long baseAddress; //基地址 T operator*() { //到BaseAddress为基地址的内存区域,取长度为sizeof(T)字节的内容。 //将其强制类型转换为T类型的数据,并返回。 } //指针的相减是ANSI C定义的操作,返回值是long型,其值即为指针的基地址相减 long operator-(Pointer<T2> pointer) { return this.baseAddress - pointer.baseAddress; } //由于和主题不甚相关,为了简便起见,此处对prefix ++和postfix ++不作区分 T operator++() { return baseAddress += sizeof(T); } } //全局取地址符 Pointer<T> operator&(T Value) { return new Pointer<T>() {baseAddress = Value的基地址}; } 注意*和++操作,它和指针的类型是相关的。 更明确地说,指针不是一个数据类型,它是一组数据类型,概念上相当于C++中的模板(template)。而这组数据类型间不同之处在于内含类型信息的不同。 2. 相关证据(1) C语言编译器中将int *与char *视为不同的类型如果直接将一个int *的变量赋给char *的话,VS2008会报: Warning C4133: 'initializing' : incompatible types - from 'int *' to 'char *' 当然我承认这只是一个warning,但考虑到C语言是一个弱类型语言,而int *和char *的确可以相互转化,这可以作为一个佐证。 (2) int *和char *即使针对同一地址,取出来的值也不同。类似的例子很好写: int a = 3; int *p1 = &a; char *p2 = &a; 然后可以看*p1和*p2并不相同,编译原理上讲过n次了。 (3) int *和char *做++操作的结果不同#include<stdio.h> int main() { int a = 3; int *p1 = &a; char *p2 = &a; printf("++p1 = %p\n", ++p1); printf("++p2 = %p\n", ++p2); return 0; } 结果为: ++p1 = 0012FF64 ++p2 = 0012FF61 二者值相同,做相同的运算,却得到不同的结果,也说明指针中包含类型信息。 3. 编译器的处理当然在C语言中,是没有类这个概念的,更没有运算符重载,模板等等。因此上述的指针模型是概念上的,实际中编译器并没有用面向对象的技术来实现这一概念。根据我的猜测,编译器是通过符号表来记录指针的类型信息,然后在翻译&, ++, *等操作时检索符号表,查看相应变量的类型信息,最后将其翻译成相应的add $4, p这样具体的指令。 4. 数组和指针我的观点是,数组名和指针是一回事。所谓一回事,就是在语法允许的前提下可以相互替代,而不影响程序的表现。我承认存在例外,在下文”取地址”中详述。 一维一维比较直观。比如char [] str = “DFSB”,换成char * const = “DFSB”,其支持的运算种类和结果(除了取地址操作外)都是一致的。不再赘述。 多维多维和一维其实差别不大。在上面常见的例子中,之所以得不到相同的结果,是因为没有意识到指针内含类型信息,采用了错误类型的指针。事实上如果把程序改成这样: #include <stdio.h> int main() { int a[2][3]; int (* const b)[3] = a; printf("a = %p\n", a); printf("a + 1 = %p\n", a + 1); printf("b = %p\n", b); printf("b + 1 = %p\n", b + 1); return 0; } 编译时warning都没报一个,说明a和b是相容类型,可以直接赋值或进行隐式类型转换,考虑到前面做的实验表明,不同类型的指针间是无法如char, int间那样自如转换的,因此可以认为在编译器的眼中,int [][3]和int (* const)[3]完全就是一种类型。 运行结果是: a = 0012FF4C a + 1 = 0012FF58 b = 0012FF4C b + 1 = 0012FF58 可见结果完全相同。 取地址在上例中,如果察看&a和&b的值,事实上是不一样的。我承认在取地址的时候直接看结果的话,数组和相应类型的指针不可互换的。但正如论述数组和指针变量的不同之处在于一个是常量一个是变量一样,这个例子也有点类似取巧的意思。这结果的不同并不是数组和指针的不同之处引起的,只是表达数组的这个常量没有占内存,而指针变量必然要占内存。是qualifier的不同引起了结果的不同,而不是说数组名和指针有本质的不同。 从概念上,也许我们可以将数组理解为一种数据结构(事实上C#/Java中都是这样看的),而数组名是数组的一个特殊属性,它代表一个指针,但不占内存。 越讲越复杂,其实我们要探讨的问题很简单,在C语言编译器下的各种语法现象中,数组名和相应类型的指针能否相互替换而已。现在这个问题有了确定的答案了:常见情况下是可以的,但也有特例不行。 externHily给的例子我也无法索解。很诡异。在VS2008下也能重现这个问题。但考虑到我extern一个变量,结果拿来用的时候里面的值已经不是定义的那个值了,我更倾向这是编译器或者语言标准的一个bug而不是feature。 也许我们能做的除了探索编译器为什么要这么做以外,只能像Cheng说的敬而远之,实用中尽量避开吧。 小结数组和指针在语义上显然不同,数组表示一个向量或者矩阵,而指针仅仅是一个变量而已。 从语法现象来看,数组名和指针大多数情况下可以互换。 从编译器的处理来看,还是有不同的。 5. 结论1. 指针不是long,它还包含类型信息,内含类型信息不同的指针不是同一类型的指针。 2. 数组名在大多数情况下可以用相应类型的指针代替而不会引发程序运行结果的任何变化。 3. 存在例外情况,此时数组名不可用相应类型的指针代替。 [Computing]WPF画三维折线图给定数据绘制3维折线图…这的确是WPF做出来的,而且并不复杂~神奇么? excelWorkBook.SaveAs(paramWorkbookPath, Excel.XlFileFormat.xlWorkbookDefault, paramMissing, paramMissing, paramMissing, paramMissing, Excel.XlSaveAsAccessMode.xlExclusive, paramMissing, paramMissing, paramMissing, paramMissing, paramMissing); 无数的paramMissing(是一个定义为Type.Missing的变量),看得都无语了,打得手也废掉。二是这个功能的实现要求用户的机器中也装有Office 2007。这两个缺陷都相当严重。所幸在VS2010/.NET 4.0都得到了解决,C# 4.0支持默认参数了,而VS2010的No-PIA特性也可以将用Excel COM库中用到的部分抽取出来,链接到IL文件中去。很好很强大。 [Computing]底层以前写过一篇狗屁CS本科知识体系,最下面是数理逻辑和硬件方面的东西,然后操作系统,编译原理,然后程序设计语言,最上面是各专业的专用算法。后来做了一些图像视频方面(相对高层)的东西,再看看网上盛传甚至自己之前深信不疑的啥啥编程境界,发现大家好像都觉得底层的东西更牛逼一些。不过觉得没道理啊,两个人,一个做可计算理论,够底层吧,一个做AI,够高层吧,可是仅仅从两个人的研究方向就判断一个比另一个牛逼一些是没道理的。 P.S. 古汉语真神奇。好像我想表达的每个意思几千年前就有人表达过了,还都能总结出来很优美精炼的语言。 [Computing]面对WPF,我又一次震精了闲逛MSDN,发现一个类叫做RenderTargetBitmap,只需要两句话就可以把任何一个Visual对象转换成一个图片以方便显示和输出。 [Computing]现在的编译器真是NB啊复习编译原理,前年意云大神出了一个题,是这样的: P.S.感谢stephen同学的灵感启发和技术指导! [Computing]双核sin和Serialization双核sin Serialization [Computing]测试的一些体会在写完一个类之后,做一下测试是在所难免的。这个文想讨论一下以往个人采用的一些做测试的方法,并从他们的优劣之处加以比较。只是需要注意的是,这里的测试和调试的含义并不相同。测试主要针对于整个类,打进去输入看看输出对不对,在这篇文里它仅仅是一个check的过程。至于调试,则发生在测试失败时,此时就需要跟踪到类的内部逻辑中去,看一条条语句是如何执行的。二者的概念和所应用的工具都是不同的。当然,这两者也没有什么本质的区别,只是粒度不同而已。事实上对某一个类的测试也可以看做是对整个程序的调试。 P.S.小自恋一下,这些感触没有经历过七八万行代码的经验和不断的思考,一般学生,比如物理系啊考古系啊的还真没有~ [Computing]日干代码三千行 不辞长做小琐神昨天(0505)从早上7点眼皮一扒就搞个小键盘敲敲敲,上课敲,下课敲,吃饭敲,敲敲敲敲敲敲敲,终于一天把数据库实验3写完了。写了将近3000行代码。好吧,我承认,其实我写的只有2300+行,但是编译xaml是会生成c#代码的,这个实验的程序生成了600行左右。所以像我这样好大喜功的人就装作不知道,贪天之功据为己有啦啊哈哈哈~ 我也是日产3000行的人啦~~ 恩,以后不要再叫我代码民工头了,作为一个有理想有抱负的青年学生,要向着代码包工头的康庄大道奋勇前进! P.S. 大肆吹捧微软的技术总是很容易让人,尤其是业内人士本能的不爽。如有得罪,多多包涵!唉,实在不喜欢这种顾虑的感觉,在blog不想戴面具的说~ [Computing]C#占用内存的问题用C#写程序的时候,每每一开程序就占用20MB+的内存,写个稍微复杂一点的WPF程序就占到100+MB。相比之下,用MFC写的程序占用的内存则相当少,20+MB的内存占用就比较多了。这点曾让我相当头疼,毕竟内存占用也是衡量程序质量的一个比较重要的指标。 在读了"C# 3.0 in a nutshell"后,终于明白了是怎么回事。Garbage Collector并不是每时每刻都在回收垃圾的,尤其是对于类似C/C++中的栈型变量,即使在程序控制已经超出这些变量的作用域时,Garbage Collector也并不立即回收空间。这有两方面的考虑:一方面,既然系统有空余的物理内存,为什么要牺牲一定性能来花精力来保持一个较小的内存占用呢?让那些空闲的物理内存得到充分利用,为性能的提升做自己的一份贡献岂不很好?从另一个角度—Lazy Calculation--的角度来看,选择性的回收当前不再需要的内存需要较大的代价,而程序结束后一次性释放整个进程的内存则很简单。若仅仅当系统真正需要内存的时候才进行内存回收,就能讨一个巧:如果系统内存一直很充裕,就一直不需要进行垃圾回收,这部分时间就省下来了,程序的性能也就变相上去了。而最坏的情况不过是在运行中系统需要这些内存,那时再回收也不会有什么损失,不过是不能讨到巧罢了。(Lazy Calculation的思想在"More Effective C++"中也有介绍,在很多程序或系统,包括linux中有重要应用)。 我们可以改动一些C#的程序来验证CLR的确是这样做的。下面的语句可以得到当前程序真实的内存占用量: 对于一个现有的WPF程序做实验,可以知道在某时刻其真实内存占用量为52MB,而此时用Windows XP的Task Manager观察则得到它的内存占用为100+MB。其中多出的这部分余量就是已释放但尚未回收的垃圾空间。此时系统的空余物理内存还有60%+,所以Garbage Collector没有进行回收。而若保持这个程序继续运行,打开一个占用大量内存的程序(如Visual Studio),则可以发现Task Manager中显示该程序的内存占用变成了52MB+,这是一个比较理想的结果,说明这个程序真实占用的内存不会超过53MB,在某种程度上验证了上文的理论。也说明C#的大内存占用并不是一个弊病,而是一种合理利用系统资源的手段。这和某些linux爱好者的观点是一致的:linux内存占用比windows xp要大,是合理利用系统资源这一设计意图的体现而并非bug。(当然,Garbage Collector对于回收垃圾的时机的选择和很多因素相关,并不仅仅受系统空闲内存的影响) 如果看Task Manager里面大内存占用实在不爽也可以用强制垃圾回收的办法解决。对于上面的程序,如果在统计前调用GC.Collect();强制回收内存,在Task Manager中显示的内存占用就变成了37MB。这个比较诡异,竟然小于程序内部统计得到的数值。根据google的结果,这也许是Windows XP下的Task Manager对CLR程序的统计有问题所致,据说在Vista下问题有所改善,不过以我的小破本Pentium M 1.2G + 768MB RAM的条件怕是没有条件实验证实了。 当然,即使做了这样的优化,这个C#占用的内存也有50MB左右,和C++程序还是有一定差距,这也许是CLR程序运行的原理决定的,其中有.NET Framework占用的内存。这篇文只能指出C#程序的内存占用也许没有Windows XP下Task Manager说的那样夸张,并提出了一种验证的方法。至于如何进一步减小C#程序的内存占用,也许还需要更精确可靠的内存统计手段,更新版本的.NET Framework和更详细的C#内存分配原理说明。 [Computing]OpenCV & EmguCV配置心得OpenCV在计算机视觉领域可谓无人不知,无人不晓。最近也开始接触这个库,可是由于OpenCV是为C/C++程序员设计的,里面含有大量复杂的class和struct,如果按照传统C++/.NET混合编程,也就是通过DLL结合Marshal来干的话会死人的,所以选用了高人改写好的.NET版本:EmguCV。选择这个版本而并非OpenCVDotNET或者SharperCV是因为它支持的功能更多而且也一直在更新。在配置的过程中遇到了不少烦人琐屑和诡异的问题,总结如下: OpenCV的配置相对简单。目前能下载到的稳定版本是1.0,Windows用户拿到是一个含源码的二进制安装包。直接安装就好了,安装文件会自动把库的bin目录加到Path变量里去。如果做C++工程还需要在Linker和VC++ Directories里添加相应文件,和其他库的使用没什么区别甚至更简单,此处不述。 EmguCV的配置则复杂一些。目前的最新版本是1.5.0.1,解压得到的是由C/C++编译出来的OpenCV 1.1原生DLL和一系列.NET接口DLL。网上有文章说EmguCV依赖于OpenCV,但没有说明不安装OpenCV能否使用EmguCV,推测不装应该也能用(未实验证实)。如果下载下来后直接新建一个C#工程,用CVInvoke写个HelloEmguCV,再把EmguCV的几个DLL文件加到References里面去,的确可以正常编译,但运行时会有Exception,说无法打开cxcore110.dll。这个问题折腾了好一段时间,后来才反应过来这是因为Emgu随附的OpenCV所在目录没有加到Path变量中去,所以会找不到文件。将随附的OpenCV库文件拷入刚才装的OpenCV 1.0的bin里面去后这个问题解决了,但会报另一个错说DLL文件无法正常初始化。这个就比较好办了,用过VC的同学应该都遇见过,是没有装VC重分发包所致。但是OpenCV是跨平台的呀,怎么会用VC来写…实在很诡异。上网搜了一下,还真的需要VC++05 SP1的RedistPackage。安装以后问题解决…一直想不明白…不过好在现在Samples里面的例子,不论是控制台的还是WPF的都可以正常编译运行了:-) P.S.感觉EmguCV和.NET,尤其和(Managed)GDI+结合得相当紧密,不像DirectShow.NET自成体系,什么数据结构都要用自己的。赞一个~ [Computing]如C++般飞奔的C#用C#实现一个图像边缘检测算法,处理一个300*375的图片竟然用了2.38s,吐血。用VSTS的代码分析工具发现,GDI+里面的GetPixel()这个函数竟然占用了运行时间的53.92%而SetPixel()占用了7.14%,光是图像的读入和输出就占了60%+(见图1)。这样的速度可受不了。稍微分析一下就能得出原因,Bitmap是一个通用类,可以用来处理多种图像格式,因此GetPixe()l和SetPixel()的效率自然会在诸多分支跳转中受到拖累,再加上这两个函数被频繁反复调用,对程序效率的影响尤为明显。再深入想一想,实际上这样的函数无非是把一块内存复制并返回而已,因此可以考虑直接深入Bitmap的内部用指针糟吊搞,这样不仅可以省去反复调用函数压栈弹栈内存分配等琐碎工作,也能够在图像格式已经确定下来的前提下节约分支跳转的时间。
放狗搜了一下,Google大神告诉我们,其实微软已经考虑到了这样的效率问题,提供了两个东西,一个叫做BitmapData类,专门给需要效率的同学实现上面的思路,一个是传说中的Unsafe Code支持,使得C#中仍然可以使用指针,为BitmapData的使用创造了条件。至于细节上如何使用这个类,线程安全性,内存如何对齐等问题,详询10086,此处不述。或者也可以拜读这位同学的文章:使用C#进行图像处理的几种方法。(此文赞一个,写得相当清楚) 由此可见,当效率不理想的时候,不要怨天尤人,不要怪C#或Java此类语言先天不足。而应该先用工具搞清楚问题究竟出在什么地方,然后抓主要矛盾,着重优化这几个函数。此外,有的放狗也是重要技巧之一。一开始用关键字"GDI+ 效率 C#",搜出来一堆关于"GDI+比OpenGL效率还高"的资料,后来换了"GetPixel C# 效率",就搜出了相关文章。此外在没有网络的时候也可以去看看MSDN,事实上如果认真阅读MSDN中Bitmap.Members这个文档的话,会发现LockBits()这个函数,深入探究下去也可以找到相当详细的资料和例程。总而言之,积极的态度,理性的分析和工具的合理利用是克服困难的有力武器。 [Computing]也谈羊与车问题有一个很经典的概率问题是羊与车问题,大意是在一个游戏中有三个门,只有一个门后面有车,另外两个门后面是羊。你想要车,但你不知道哪一个门后面有车。主持人让你随便选了一个门。比如说,你选择了1号门。但你还不知道你是否选到了车。然后主持人打开了另一扇门,比如3号。你清楚地看到3号门后面是一只羊。现在主持人给你一个改变主意的机会。请问你是否会换选成2号门?(更详细的问题描述与历史探寻请参见http://www.matrix67.com/blog/archives/73) 目前比较公认的结果是应该换,而且换了后得到车的概率是不换的2倍。看起来的确有点违背直觉。 以前理解这个问题的方法是:亲自做实验。当时找了三个水瓶盖和一个透明胶,叫一个同学来呆呆的问,你指一个,换不换?再指一个,换不换?做了几次实验豁然开朗,确然是这样!应该换,而且换了之后概率的确是不换的两倍。这里卖个关子不说为什么,有兴趣的孩子可以自己去做做看。 最近在看Bayes理论(http://en.wikipedia.org/wiki/Bayesian_inference),发现用Bayes理论也可以轻松的解决这个问题,而且更有说服力。 如果用P(Car)代表你刚开始选中的门里有车的概率,用P(Sheep)代表主持人打开的门里是羊的概率,用|表示条件概率,那么由Bayes公式有: 其中P(Sheep|Car)表示在你选中了车的条件下,主持人打开的门里是羊的概率,显然是1。由前面的假设也显然有P(Car)=1/3。但是,关键来了,P(Sheep)并不是2/3,因为主持人是事先知道哪个门里有车的,他显然不会选一个门给你:哈哈,SB了吧!看看,车在这里面!然后问你换不换。所以P(Sheep)=1,也就是说主持人必然选一个有羊的门给你。正因为这个原因,P(Car|Sheep)=1/3而不是1/2,它的改变正是因为新的证据的出现。这是Bayes理论的基本思想。 P.S.好吧,我承认这个Blog也开始写数学的东西了。恩恩,数学的重要性竟然才被发掘出来,惭愧啊。 [Computing]C#和C++混合编程[Computing]C#和C++混合编程 C#代码: 其中的细节,比如int和char等数据类型在C++和C#中占用的空间不同等等CLR会自动处理。(主要是通过Marshal类自动处理) 这样的调用还支持调试。打开C#工程的Properties,在Debug选项卡中勾选Enable unmanaged code debugging即可启用C++代码调试。这样在调试模式下,调用这个函数时可以继续按F11跟进函数内部进行调试。 return HBitmapToReturn; C#代码(用户界面采用WPF,略去相关声明和引用): [DllImport("MFCDll.dll")] IntPtr hBitmap = GetABitmap(dialog.FileName); for (int i = 0; i < StaticElementNumber; ++i) return true; extern "C" AFX_API_EXPORT int GetArrayElementNumber() C#代码: [DllImport("MFCDll.dll")] //Show the data. 有了这三个功能,一般来说C++代码复用到C#平台上就是比较简单的事情了。 |
|
|