Category Archives: Operating Systems

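My favorite freeware

My ThinkPad X60 is almost three years old. It had been getting slower and slower and crashing often, so when I finally had some free time I reinstalled Windows XP Professional on it and took a few notes along the way.

Once the operating system is installed, install ThinkVantage System Update first, so that the ThinkPad drivers and utilities can all be updated directly from there. By the way, do not install Client Security Solution, because installing it is really asking for trouble! I also skipped System Migration Assistant and Rescue and Recovery.

Next come the applications, using free software wherever possible: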

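PicPick: a powerful screen capture tool
XnView: image viewer with simple image editing
Paint.NET: a small but capable paint program
FileZilla: file transfer client supporting FTP and SFTP
7-Zip: file archiver
K-Lite Codec Pack: plays all kinds of audio and video files
Adobe Reader: PDF reader

Alcohol 52% Free Edition: virtual CD/DVD drive (the free edition is enough; you can opt out of the sponsored software)
Avira AntiVir: antivirus
Unlocker: forcibly unlocks locked files
RocketDock: a Mac-like dock
New Chewing (新酷音) input method (latest version: 0.3.4.8)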

Other network tools include:

Tools from Google:

Microsoft's PowerToys for Windows XP also includes several nice little tools:

CmdHere
PowerToy Calculator
Tweak UI

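Microsoft also used to offer Microsoft Chinese Date and Time, a lunar calendar and world clock tool that I really like. Microsoft seems to have taken the download link down, but a quick web search still finds it.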

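One long integer, each system's own interpretation (in 64-bit systems)

7 comments

The size of a long integer may differ between 64-bit systems

Maybe I'm just behind the times…

I had always assumed that in C/C++ the short, long and long long types were fixed at 2, 4 and 8 bytes, and that only int changed with the platform, being 2 bytes on 16-bit systems and 4 bytes on 32-bit systems. So naturally int should become 8 bytes (64 bits) on a 64-bit machine, right?!

While putting together the previous post, "Bypass the 2GB file size limit on 32-bit Linux", I was startled to find that on 64-bit systems the size of long is also open to interpretation!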

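First, even on 64-bit machines, most systems still use a 4-byte int. The main reason is to avoid having to change too much code when porting programs from 32-bit to 64-bit systems.

Next, see the explanation in Wikipedia: 64-bit data models.

The vast majority of UNIX systems use the LP64 data model in 64-bit mode. There, long is no longer 4 bytes; it becomes 8 bytes!

Win64, however, does not use LP64. It uses the LLP64 data model, in which long is still 4 bytes.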

Many 64-bit compilers today use the LP64 model (including Solaris, AIX, HP, Linux, Mac OS X, and IBM z/OS native compilers). Microsoft’s VC++ compiler uses the LLP64 model.

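The biggest difference between the two data models is the size of long: 64 bits in LP64, 32 bits in LLP64.

LLP64 is essentially the same as a 32-bit system; the main difference is that addresses (pointers), and pointer-sized types such as size_t, become 64-bit. Data objects (classes, structures) that contain no such members have exactly the same size as on a 32-bit system!

In LP64, besides pointers becoming 64-bit, long becomes 64-bit as well. So porting a 32-bit program to 64-bit on UNIX may take somewhat more effort than on Windows.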

So we have two issues that affect program compatibility:

On UNIX, the size of long differs between 32-bit and 64-bit systems
Among 64-bit systems, UNIX and Windows disagree on the size of long
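
The distinction between the data models can be sketched in a few lines of C. This is only an illustration: the enum and the function names (classify_data_model, data_model_name, current_data_model) are made up for this example, and only the three models discussed here are recognized.

```c
#include <stddef.h>

/* The data models discussed above; anything else is reported as "other". */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

/* Classify a data model from the sizes (in bytes) of int, long and a pointer.
 * Hypothetical helper for illustration, not a standard function. */
enum data_model classify_data_model(size_t int_size, size_t long_size,
                                    size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return MODEL_ILP32;   /* typical 32-bit system */
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return MODEL_LP64;    /* most 64-bit UNIX systems */
    if (int_size == 4 && long_size == 4 && ptr_size == 8)
        return MODEL_LLP64;   /* Win64 */
    return MODEL_OTHER;
}

/* Human-readable name of a data model. */
const char *data_model_name(enum data_model model)
{
    switch (model) {
    case MODEL_ILP32: return "ILP32";
    case MODEL_LP64:  return "LP64";
    case MODEL_LLP64: return "LLP64";
    default:          return "other";
    }
}

/* The data model the current compiler targets. */
enum data_model current_data_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Printing data_model_name(current_data_model()) from a program built on each target platform shows which model that compiler uses.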

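To make programs more portable between 32-bit and 64-bit systems and between UNIX and Windows, switching to fixed-width data types is a good programming habit.

On UNIX, we can use the type definitions in the stdint.h header file: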

int8_t     8-bit signed integer
int16_t    16-bit signed integer
int32_t    32-bit signed integer
int64_t    64-bit signed integer
uint8_t    8-bit unsigned integer
uint16_t   16-bit unsigned integer
uint32_t   32-bit unsigned integer
uint64_t   64-bit unsigned integer

On Windows, use the following fixed-size integer types instead:

INT8       8-bit signed integer
INT16      16-bit signed integer
INT32      32-bit signed integer
INT64      64-bit signed integer
UINT8      8-bit unsigned integer
UINT16     16-bit unsigned integer
UINT32     32-bit unsigned integer
UINT64     64-bit unsigned integer

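Never use plain int and long again!

This matters most in network programs: the client may well be Windows while the server is UNIX, with 32-bit and 64-bit systems mixed in, and one careless step leads to an incompatibility…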
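
As a rough illustration for the network case, a field can be given a fixed width with the stdint.h types and written out byte by byte. The helper names put_u32_be and get_u32_be are made up for this sketch, and the fixed big-endian byte order is an extra assumption that goes beyond the size issue discussed here.

```c
#include <stdint.h>

/* Store a 32-bit value into buf[0..3] in big-endian (network) byte order.
 * The field is 4 bytes on every platform, whatever the size of long. */
void put_u32_be(uint8_t *buf, uint32_t value)
{
    buf[0] = (uint8_t)(value >> 24);
    buf[1] = (uint8_t)(value >> 16);
    buf[2] = (uint8_t)(value >> 8);
    buf[3] = (uint8_t)value;
}

/* Read the value back from 4 bytes in big-endian order. */
uint32_t get_u32_be(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because both sides agree on a 4-byte field, a Windows client and a UNIX server would read the same value regardless of their data models.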

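Of course, programming on 64-bit systems involves much more than the basic data types above. Besides pointers becoming 64-bit, size_t and off_t, which many system functions use, also become 64-bit. Take special care when your program uses these types, especially when casting: never store these values in a 32-bit integer, or the results may be unpredictable!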
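
One way to guard against such truncation is to check the range before narrowing. The following is a minimal sketch; narrow_to_u32 is a hypothetical helper name, not a library function.

```c
#include <stdint.h>

/* Copy a 64-bit size or offset into a 32-bit variable only if it fits.
 * Returns 0 on success, -1 if the value would be truncated
 * (in which case *out is left untouched). */
int narrow_to_u32(uint64_t value, uint32_t *out)
{
    if (value > UINT32_MAX)
        return -1;
    *out = (uint32_t)value;
    return 0;
}
```

A caller that must keep a 32-bit field (for example in an existing file format) can then report an error instead of silently storing a wrong length.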

Finally, here is a small program that reports the sizes of the main data types on your system:

#include <stdio.h>
#include <sys/types.h>

int main(void)
{
        /* sizeof yields a size_t, so it is printed with %zu rather than %d */
        printf("sizeof(short)     = %zu\n", sizeof(short));
        printf("sizeof(int)       = %zu\n", sizeof(int));
        printf("sizeof(long)      = %zu\n", sizeof(long));
        printf("sizeof(long long) = %zu\n\n", sizeof(long long));

        printf("sizeof(size_t)    = %zu\n", sizeof(size_t));
        printf("sizeof(off_t)     = %zu\n", sizeof(off_t));
        printf("sizeof(void *)    = %zu\n", sizeof(void *));
        return 0;
}

References:

Bypass the 2GB file size limit on 32-bit Linux

2 comments

Bypass the 2GB file size limit on 32-bit Linux

On 32-bit Linux, writing a file larger than 2GB fails, and may even terminate the program.

This is because the file offset that Linux uses internally is defined as a long, and long is 32 bits on a 32-bit system. The largest supported file size is therefore 2^31-1 = 2,147,483,647 bytes, i.e. 2GB minus 1 byte.
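
The arithmetic behind that limit can be checked with a tiny helper. This is just an illustrative sketch: max_signed_value is a made-up name, and it assumes two's-complement integers and a width between 2 and 64 bits.

```c
#include <stdint.h>

/* Largest value of a signed two's-complement integer that is `bits` wide.
 * One bit is used for the sign, so the maximum is 2^(bits-1) - 1.
 * Valid for 2 <= bits <= 64. */
int64_t max_signed_value(unsigned bits)
{
    return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}
```

For a 32-bit offset this gives 2,147,483,647, the largest file size mentioned above; for a 64-bit offset the limit is far beyond any practical file size.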

On 64-bit systems (such as AMD64 or IA64), long is defined as 64 bits, so the problem does not arise:

#  if __WORDSIZE == 64
typedef long int int64_t;
# endif

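On FreeBSD, however, even 32-bit systems have no 2GB file size limit, because FreeBSD has always used 64-bit file offsets internally.

So on 32-bit Linux, a program needs some extra handling to write files larger than 2GB correctly.

Let's first write a small test program (large.c):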

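#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <signal.h>
#include <unistd.h>
#include <errno.h>

/* Catch SIGXFSZ so that its default action (terminate and dump core) is
   not taken. printf() is not async-signal-safe; it is used here only to
   keep the demo short. */
void sig_xfsz(int sig)
{
        printf("ERROR: SIGXFSZ (%d) signal received!\n", sig);
}

int main(void)
{
        int     i, fd;
        char    dummy[4096];

        signal( SIGXFSZ, sig_xfsz );

        unlink("large.log");
        fd = open("large.log", O_CREAT|O_WRONLY, 0644 );
        if( fd < 0 ) {
                perror("open");
                exit(1);
        }

        memset( dummy, 0, sizeof(dummy) );
        /* 2GB = 4KB x 524288: write 524287 full blocks plus 4095 bytes,
           which leaves the file exactly 1 byte short of 2GB */
        for( i = 0 ; i < 524287 ; i++ )
                write( fd, dummy, 4096 );
        write( fd, dummy, 4095 );
        printf("large.log: 2147483647 bytes\n");

        /* this one extra byte would bring the file to exactly 2GB */
        if( write( fd, dummy, 1 ) < 0 )
                printf("ERROR: %s [errno:%d]\n",strerror(errno),errno);
        else
                printf("large.log: 2147483648 bytes\n");

        close(fd);
        exit(0);
}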

On 32-bit Linux, compiling this program without any special options and running it gives:

# gcc -o large32 large.c
# ./large32
large.log: 2147483647 bytes
ERROR: SIGXFSZ (25) signal received!
ERROR: File too large [errno:27]

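When writing byte number 2147483648, the program receives the SIGXFSZ signal, write() returns -1, and errno is set to 27 (File too large). Worse, if the program does not handle SIGXFSZ as above, the default signal action terminates the program and produces a core dump.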
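
The same signal and errno can be reproduced without writing 2GB of data. The sketch below swaps in a different trigger: it lowers the process's file size limit with setrlimit(RLIMIT_FSIZE), which is not the 32-bit offset limit described in this post but raises the same SIGXFSZ and EFBIG. The function name and the idea of passing a scratch path are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/resource.h>
#include <unistd.h>

/* Lower the soft file-size limit to `limit` bytes, ignore SIGXFSZ, fill the
 * file at `path` up to the limit and then write one more byte.
 * Returns the errno of that last write (0 if it unexpectedly succeeded),
 * or -1 if the experiment could not be set up. Hypothetical helper. */
int errno_when_writing_past_limit(const char *path, rlim_t limit)
{
    struct rlimit saved, lowered;
    void (*old_handler)(int);
    char byte = 0;
    int fd, result = -1;
    rlim_t written;

    if (getrlimit(RLIMIT_FSIZE, &saved) < 0)
        return -1;
    lowered = saved;
    lowered.rlim_cur = limit;

    /* With SIGXFSZ ignored, write() fails with EFBIG instead of the
     * process being terminated. */
    old_handler = signal(SIGXFSZ, SIG_IGN);
    if (setrlimit(RLIMIT_FSIZE, &lowered) == 0) {
        fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd >= 0) {
            for (written = 0; written < limit; written++)
                if (write(fd, &byte, 1) != 1)
                    break;
            if (written == limit)
                result = (write(fd, &byte, 1) < 0) ? errno : 0;
            close(fd);
            unlink(path);
        }
        setrlimit(RLIMIT_FSIZE, &saved);   /* restore the original limit */
    }
    if (old_handler != SIG_ERR)
        signal(SIGXFSZ, old_handler);
    return result;
}
```

Ignoring SIGXFSZ, as done here, is an alternative to installing a handler: the program then only has to check the return value of write() and errno.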

Next, let's compile the same program with -D_FILE_OFFSET_BITS=64 and try again:

# gcc -D_FILE_OFFSET_BITS=64 -o large64 large.c
# ./large64
large.log: 2147483647 bytes
large.log: 2147483648 bytes

And indeed, the 2GB limit is gone!

On 32-bit FreeBSD, the same program runs correctly with or without this definition.

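When dealing with such large files, though, different compiler options are not the whole story; the use of some functions has to be adjusted as well. For example, fseek() and ftell() take or return the offset as a long integer:

int fseek(FILE *stream, long offset, int whence);
long ftell(FILE *stream);

On any 32-bit system, even FreeBSD, long is only 32 bits, so these must be replaced with the off_t versions:

int fseeko(FILE *stream, off_t offset, int whence);
off_t ftello(FILE *stream);

On Linux, if _FILE_OFFSET_BITS is defined as 64, off_t automatically becomes a 64-bit type (on FreeBSD, off_t is always 64 bits).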
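
As a small usage sketch of the off_t versions, the following helper measures the size of an open stream. The name stream_size is made up for this example, and the two macros are defined in the source file here instead of on the compiler command line.

```c
/* These must be defined before the first #include to take effect. */
#define _LARGEFILE_SOURCE
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>

/* Return the size of an open stream in bytes, or -1 on error.
 * The current position is saved and restored. Because the offsets are
 * off_t rather than long, this also works past 2GB on 32-bit Linux
 * when off_t is 64 bits wide. */
off_t stream_size(FILE *fp)
{
    off_t saved, size;

    saved = ftello(fp);
    if (saved < 0 || fseeko(fp, 0, SEEK_END) < 0)
        return -1;
    size = ftello(fp);
    if (fseeko(fp, saved, SEEK_SET) < 0)
        return -1;
    return size;
}
```

When printing an off_t, casting it to long long and using %lld avoids depending on its exact width.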

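The compiler options needed for reading and writing files larger than 2GB differ from system to system, and even on Linux they depend on whether the system is 32-bit or 64-bit. A simple way to find out is to ask getconf, which comes with glibc, for the flags needed when compiling and linking: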

# getconf LFS_CFLAGS
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
# getconf LFS_LDFLAGS 
 
#

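The above is the output on a 32-bit Redhat Linux system. It means that on this system, a program that needs to read and write files larger than 2GB must be compiled with -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64, while no extra flags are needed when linking.

References: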

Large File Support in Linux
LFS: Large File Support (Wikipedia)

Change the core dump file name in Linux and FreeBSD


Following the earlier note on enabling core dumps, here is a note on changing the core dump file name.

In Linux (since Linux 2.6 and 2.4.21),
you can change the core dump file name by writing a template to the file /proc/sys/kernel/core_pattern. The template may contain these specifiers:

         %%  A single % character
         %p  PID of dumped process
         %u  real UID of dumped process
         %g  real GID of dumped process
         %s  number of signal causing dump
         %t  time of dump (seconds since 0:00h, 1 Jan 1970)
         %h  hostname (same as 'nodename' returned by uname(2))
         %e  executable filename

Linux has a default core file name pattern of “core”.
Alternatively, if /proc/sys/kernel/core_uses_pid contains a non-zero value, the core dump file name will include the suffix .PID (the process ID), e.g. core.PID
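
To get a feel for how such a template turns into a file name, here is a rough user-space sketch of the expansion. It is not the kernel's code: expand_core_pattern is a made-up name, and only the %%, %p and %e specifiers from the table are handled, with everything else simply dropped.

```c
#include <stdio.h>
#include <string.h>

/* Expand a core_pattern-style template into `out`.
 * Handles only %%, %p (PID) and %e (executable name); any other specifier
 * and a trailing single % are dropped in this simplified sketch.
 * Returns 0 on success, -1 if `out` is too small. */
int expand_core_pattern(const char *pattern, int pid, const char *exe,
                        char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = pattern; *p != '\0'; p++) {
        char piece[32];
        const char *text = piece;

        if (*p != '%') {
            piece[0] = *p;              /* ordinary character */
            piece[1] = '\0';
        } else {
            p++;
            if (*p == '\0')
                break;                  /* trailing % is dropped */
            switch (*p) {
            case '%': piece[0] = '%'; piece[1] = '\0'; break;
            case 'p': snprintf(piece, sizeof(piece), "%d", pid); break;
            case 'e': text = exe; break;
            default:  piece[0] = '\0'; break;   /* not handled here */
            }
        }

        size_t len = strlen(text);
        if (used + len >= out_size)
            return -1;
        memcpy(out + used, text, len + 1);      /* copies the '\0' too */
        used += len;
    }
    return 0;
}
```

With the template /var/crash/core.%e.%p (the directory is only an example), a process named httpd with PID 1234 would produce /var/crash/core.httpd.1234.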

In FreeBSD, the sysctl variable “kern.corefile” controls the file name of the core dump.

Any sequence of %N in this filename template will be replaced by
the process name, %P by the process's PID, and %U by the UID.

FreeBSD has a default core file name pattern of “%N.core”

You can include a path in the file name pattern in both Linux and FreeBSD.
This makes it possible to put core dump files in a separate directory.

Enable core dump in Linux and FreeBSD

1 comment

Just a note.

You can enable core dump by:

[bash] edit /etc/profile

ulimit -c unlimited

[csh/tcsh] edit /etc/csh.cshrc

limit coredumpsize unlimited

You can disable core dump by:

[bash] edit /etc/profile

ulimit -c 0

[csh/tcsh] edit /etc/csh.cshrc

limit coredumpsize 0
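
A process can also adjust the same limit for itself from C with setrlimit(). The sketch below is only an illustration with made-up function names; it changes the soft limit, which can be raised no higher than the hard limit, so "unlimited" is only reached if the hard limit allows it.

```c
#include <sys/resource.h>

/* Raise the soft core-size limit to the hard limit, roughly what
 * "ulimit -c unlimited" does when the hard limit is unlimited.
 * Returns 0 on success, -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) < 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Set the soft core-size limit to 0, like "ulimit -c 0".
 * The hard limit is left alone so the change can be undone later. */
int disable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) < 0)
        return -1;
    rl.rlim_cur = 0;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

This is handy for daemons that are not started from a login shell and therefore never read /etc/profile or /etc/csh.cshrc.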

On FreeBSD, you also need to check the setting of kern.coredump:

# sysctl -a |grep kern.coredump
kern.coredump: 0
# sysctl kern.coredump=1
kern.coredump: 0 -> 1
# sysctl -a | grep kern.coredump
kern.coredump: 1

You can make this setting persistent in /etc/sysctl.conf

[2008/01/01] Thanks to gslin for the addition: kern.sugid_coredump controls core dumps for setuid/setgid processes in FreeBSD.

The disappearing disk space


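One day, a colleague noticed that a C program running on UNIX seemed to eat a lot of disk space after running for a while. The space it used, as counted by du, was very different from what df reported, and once the process was stopped the space came back by itself.

Finally, a careful analysis with fstat revealed the cause:

A file that is already open keeps occupying disk space for everything written through the original file descriptor, even if the file is deleted (unlink) while it is open. The more is written, the more space is occupied.

In practice this most often happens with log rotation, especially when the rotated logs are compressed. After gzip compresses a log, the original file is deleted and only XXX.gz remains. If nobody then tells the program writing the log to reopen it (that is, to start a new file), the program unknowingly keeps writing to the old one, and the space gets used up mysteriously!

For example, newsyslog, the log rotation tool on FreeBSD, has a field in its configuration file (newsyslog.conf) for sending a signal to a process after rotation, and apache (httpd) accepts SIGUSR1 as the signal to reopen its log files (for apache this is actually a graceful restart). Many people think this is only so that logging continues without losing entries, but more importantly: if you don't do it, your disk may fill up very quickly…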
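
On the program's side, reacting to such a signal could look roughly like the sketch below. It is not how apache does it; the function names are made up, and the choice of SIGUSR1 simply mirrors the example above.

```c
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Set by the signal handler, checked by the logging code. */
static volatile sig_atomic_t reopen_requested = 0;

static void on_rotate_signal(int sig)
{
    (void)sig;
    reopen_requested = 1;   /* only set a flag; do the real work later */
}

/* Install the handler for SIGUSR1. Returns 0 on success, -1 on error. */
int install_rotate_handler(void)
{
    return signal(SIGUSR1, on_rotate_signal) == SIG_ERR ? -1 : 0;
}

/* Call this before each log write. If a rotation was signalled, the log
 * is opened again by name (creating a fresh file) and the old descriptor
 * is closed, which releases the space of the deleted file.
 * Returns the descriptor to use from now on. */
int reopen_log_if_requested(const char *path, int fd)
{
    int new_fd;

    if (!reopen_requested)
        return fd;
    reopen_requested = 0;

    new_fd = open(path, O_CREAT | O_WRONLY | O_APPEND, 0644);
    if (new_fd < 0)
        return fd;          /* keep writing to the old file on failure */
    close(fd);
    return new_fd;
}
```

Doing only a flag assignment inside the handler keeps it safe; the actual open() and close() happen in normal program flow.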

We can write a simple program to test this situation:

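#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        int     fd, i;
        char    cmd[32], buf[1024];

        memset( buf, 0, sizeof(buf) );
        snprintf(cmd,sizeof(cmd),"df .");

        printf("==> open file for write and delete it ...\n");
        /* O_CREAT needs a mode argument; without one the new file gets
           arbitrary permission bits */
        fd = open( "test-file.log", O_CREAT|O_WRONLY|O_TRUNC, 0644 );
        if( fd < 0 ) {
                perror("open");
                return 1;
        }
        unlink("test-file.log");
        fflush(stdout);         /* keep our output ahead of the command's */
        system(cmd);

        printf("\n==> write 100MB to file ...\n");
        for( i = 0 ; i < 1000*100 ; i++ )
                write( fd, buf, 1024);
        fflush(stdout);
        system(cmd);

        printf("\n==> close file ...\n");
        close(fd);
        fflush(stdout);
        system(cmd);
        return 0;
}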

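First, this small program opens a file and immediately deletes it (without closing it), then runs “df .” to check the current disk usage. The second step writes 100MB of junk data to the still-open file (file descriptor) and runs “df .” again. Finally it closes the file and runs “df .” once more. The output is: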

==> open file for write and delete it ...
Filesystem  1K-blocks     Used     Avail Capacity  Mounted on
/dev/ad8s1d 144520482 28011428 104947416    21%    /home

==> write 100MB to file ...
Filesystem  1K-blocks     Used     Avail Capacity  Mounted on
/dev/ad8s1d 144520482 28111508 104847336    21%    /home

==> close file ...
Filesystem  1K-blocks     Used     Avail Capacity  Mounted on
/dev/ad8s1d 144520482 28011428 104947416    21%    /home

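We can see that after the program wrote 100MB, the space really was taken, even though the file had already been deleted and no longer showed up in the directory listing. And as soon as the open file was closed, the space was released.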
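
A program can also check for this condition itself with the fstat() system call (not to be confused with the fstat command): a file that is still open but has been deleted has a link count of zero. The helper names below are made up for this sketch.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file and delete it right away while keeping it open, as the
 * test program above does. Returns the open descriptor, or -1 on error. */
int open_and_unlink(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (fd >= 0)
        unlink(path);
    return fd;
}

/* Returns 1 if the open file no longer has any directory entry
 * (link count 0), 0 if it still has one, -1 on error. */
int is_unlinked(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;
    return st.st_nlink == 0;
}
```

A long-running logger could call is_unlinked() on its log descriptor from time to time and reopen the log when it returns 1.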

Next, let's change the df in the program to fstat, which shows the situation more clearly:

snprintf(cmd,sizeof(cmd),"fstat -f -p %d .", getpid());

This is the final result:

==> open file for write and delete it ...
USER     CMD     PID   FD MOUNT     INUM MODE         SZ|DV R/W
cdsheen  a.out 91475   wd /home 12694528 drwxr-xr-x    2048  r
cdsheen  a.out 91475 text /home 12694672 -rwxr-xr-x    7910  r
cdsheen  a.out 91475    3 /home 12694673 ----r-x--x       0  w

==> write 100MB to file ...
USER     CMD     PID   FD MOUNT     INUM MODE         SZ|DV R/W
cdsheen  a.out 91475   wd /home 12694528 drwxr-xr-x    2048  r
cdsheen  a.out 91475 text /home 12694672 -rwxr-xr-x    7910  r
cdsheen  a.out 91475    3 /home 12694673 ----r-x--x  102400000  w

==> close file ...
USER     CMD     PID   FD MOUNT     INUM MODE         SZ|DV R/W
cdsheen  a.out 91475   wd /home 12694528 drwxr-xr-x    2048  r
cdsheen  a.out 91475 text /home 12694672 -rwxr-xr-x    7910  r

Dada's Blog

Just for fun